GitHub Code
Using GitHub Code as a source allows Kapa to pull documentation and source code directly from your GitHub repository. Documentation often lags behind code changes, and source code contains implementation details, edge cases, and nuances that written documentation cannot fully capture. By ingesting your codebase, Kapa can answer a new category of technical questions, such as API usage patterns and configuration options that may not be explicitly documented.
When you connect a repository, Kapa continuously tracks it for changes. Any new or updated files matching your configured file types, directories, and filters are automatically ingested, keeping your AI assistant in sync with your latest code. See data source refreshes for details on update frequency.
When to use GitHub Code
GitHub Code works best for code that's relevant to the questions your users ask. Thoughtfully selecting high-value repositories and directories increases the signal and usefulness of code as a source.
Good candidates:
- SDKs and client libraries
- Example repositories and code samples
- Reference implementations
- Configuration files and schemas
- User-facing portions of larger codebases
What to avoid:
- Test files (often noisy and not useful for answering questions)
- Generated code (build outputs, compiled files)
- Dependency directories (node_modules, vendor)
- Internal implementation details not relevant to users
Supported formats
Documentation files
| Format | Extensions |
|---|---|
| Jupyter Notebooks | .ipynb |
| Markdown | .md |
| MDX | .mdx |
| Text | .txt |
Source code files
| Language | Extensions |
|---|---|
| C/C++ | .c, .cpp, .cc, .cxx, .h, .hpp |
| Go | .go |
| Java | .java |
| JavaScript | .js, .jsx, .mjs, .cjs |
| Python | .py |
| Rust | .rs |
| TSX | .tsx |
| TypeScript | .ts, .mts, .cts |
Need support for another language? Contact us to request it.
Example: Setting up a Python SDK
Here's a practical example of configuring GitHub Code for a Python SDK repository:
Repository structure:
acme-python-sdk/
├── src/
│ └── acme/
│ ├── client.py
│ ├── models.py
│ └── exceptions.py
├── examples/
│ ├── basic_usage.py
│ └── advanced_config.py
├── tests/
│ └── ...
├── docs/
│ └── ...
└── README.md
Possible configuration:
- File Types: Python, Markdown
- Directories: Select src/, examples/, and root (for README.md)
This setup gives Kapa access to your SDK implementation, usage examples, and README. Tests are excluded because the tests/ directory wasn't selected. As new Python or Markdown files are added to src/ or examples/, or your README changes, they'll automatically be included on the next refresh.
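To reason about which files a configuration like this would cover before saving it, the selection can be approximated locally. The sketch below is illustrative only and does not reflect Kapa's actual implementation: the function name `is_selected` and the `include_root_files` flag are hypothetical, and it assumes that selecting the root means root-level files only, which matches the statement above that tests/ is excluded because it wasn't selected.

```python
# Extensions for the two selected file types, taken from the
# "Supported formats" tables earlier on this page.
FILE_TYPE_EXTENSIONS = {
    "Python": {".py"},
    "Markdown": {".md"},
}

# The example configuration from above.
FILE_TYPES = ["Python", "Markdown"]
DIRECTORIES = ["src/", "examples/"]


def is_selected(path, file_types, directories, include_root_files=False):
    """Approximate whether a repo-relative path falls inside the selection.

    Hypothetical helper: an assumption about how file types and
    directories combine, not Kapa's real matching logic.
    """
    allowed = set().union(*(FILE_TYPE_EXTENSIONS[t] for t in file_types))
    # A file must first have one of the selected extensions.
    if not any(path.endswith(ext) for ext in allowed):
        return False
    # Files with no "/" live at the repository root.
    if "/" not in path:
        return include_root_files
    # Otherwise the file must sit under one of the selected directories.
    return any(path.startswith(d.rstrip("/") + "/") for d in directories)


for candidate in [
    "src/acme/client.py",
    "examples/basic_usage.py",
    "README.md",
    "tests/test_client.py",
]:
    chosen = is_selected(candidate, FILE_TYPES, DIRECTORIES, include_root_files=True)
    print(candidate, "->", "included" if chosen else "skipped")
```

Under these assumptions, the files in src/ and examples/ and the root README.md are included, while anything under tests/ or docs/ is skipped.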
How code appears in answers
When Kapa references code in its answers, it cites the specific file and line numbers and links directly to the source. Clicking a citation opens that file on GitHub.
Setup
To connect a GitHub repository, you'll need:
- The repository owner and name
- For private repositories: a personal access token with Contents: read-only permission
Step 1: Connect your repository
- Go to the Sources tab on the Kapa platform and click Add new source
- Enter a name for the source, select GitHub Code, and click Continue
- Specify the GitHub repository by filling in the Owner and Name fields
- For private repositories, enter your personal access token
- Upon successful connection, a purple text box appears with the repository description

Step 2: Configure file selection
- Select File Types to choose which formats to include (e.g., Markdown, Python, TypeScript)
- Use the Directories picker to select specific directories from your repository
- Click Save to begin the ingestion process

For more granular control, expand the advanced options to configure regex patterns for file inclusion/exclusion.
Configuration reference
| Option | Description | Default | Required |
|---|---|---|---|
| Owner | GitHub username or organization that owns the repository | None | Yes |
| Name | Name of the GitHub repository | None | Yes |
| Personal access token | Token for authenticating to GitHub (for private repositories) | None | For private repositories |
| Branch or Tag | Pin to a specific branch or tag (e.g., v2.1.0, release/2.0) | Default branch | No |
| File Types | Select specific file types to include (see supported formats) | None | Yes |
| Directories | Select specific directories to include from the repository | All directories | No |
| File Include Regex | (Advanced) Further restrict to files matching this pattern within selected directories | All files | No |
| File Exclude Regex | (Advanced) Exclude files matching this pattern within selected directories | None | No |
Use Branch or Tag to pin your source to a specific SDK version or freeze the code at a known state. This is useful when you want Kapa to answer questions about a specific release rather than the latest code.
Advanced filtering
Using regex patterns
For granular control within your selected directories, use the advanced regex options. Patterns are matched against the file's relative path from the repository root (e.g., src/client/api.py).
Regex patterns only further restrict files within your selected directories. They cannot include files outside those directories.
| Goal | Pattern |
|---|---|
| Exclude test files by naming pattern | Exclude: _test\.py$|\.test\.(js|ts)$ |
| Exclude generated files | Exclude: \.generated\.|\.g\. |
| Include only specific file prefix | Include: ^client_ |
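Before saving patterns like these, it can help to try them against a few sample paths. The sketch below uses Python's `re` module and assumes unanchored search semantics over the full repo-relative path. That is an assumption about how the patterns are applied, and Kapa's regex engine may differ in detail. The sample paths are made up for illustration.

```python
import re

# Patterns copied from the table above.
EXCLUDE_TESTS = re.compile(r"_test\.py$|\.test\.(js|ts)$")
EXCLUDE_GENERATED = re.compile(r"\.generated\.|\.g\.")
INCLUDE_CLIENT_PREFIX = re.compile(r"^client_")


def matches(pattern, path):
    """Return True if the pattern is found anywhere in the relative path."""
    return pattern.search(path) is not None


samples = [
    "src/acme/client_test.py",   # hypothetical paths for illustration
    "src/app.test.ts",
    "src/api.generated.ts",
    "client_utils.py",
    "src/client_utils.py",
]

for path in samples:
    print(
        path,
        "| tests:", matches(EXCLUDE_TESTS, path),
        "| generated:", matches(EXCLUDE_GENERATED, path),
        "| client prefix:", matches(INCLUDE_CLIENT_PREFIX, path),
    )
```

One thing this makes visible: because the path starts at the repository root, `^` anchors to the beginning of the whole path, so under these assumptions `^client_` matches client_utils.py at the root but not src/client_utils.py. A pattern such as `(^|/)client_` would cover files in subdirectories as well.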
Default exclusions
Kapa automatically excludes common directories that contain dependencies, caches, or build artifacts:
| Category | Excluded patterns |
|---|---|
| JavaScript | node_modules, .next, .nuxt |
| Python | venv, .venv, __pycache__, __pypackages__, .tox, .pytest_cache, .mypy_cache, .ruff_cache |
| General | .git |
These exclusions are applied automatically, so you don't need to configure them.
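To check whether a missing file might be caught by one of these defaults, a rough local check is sketched below. It assumes the listed names are matched as directory names at any depth of the path. The table does not specify the exact matching rule, so treat that as an assumption. The helper name `in_default_excluded_dir` is hypothetical.

```python
# Directory names copied from the table above.
DEFAULT_EXCLUDED_DIRS = {
    "node_modules", ".next", ".nuxt",                      # JavaScript
    "venv", ".venv", "__pycache__", "__pypackages__",      # Python
    ".tox", ".pytest_cache", ".mypy_cache", ".ruff_cache",
    ".git",                                                # General
}


def in_default_excluded_dir(path):
    """Return True if any directory component of the path is in the default list.

    Assumption: matching is by exact directory name at any depth.
    """
    directory_parts = path.split("/")[:-1]  # drop the file name itself
    return any(part in DEFAULT_EXCLUDED_DIRS for part in directory_parts)


print(in_default_excluded_dir("node_modules/lodash/index.js"))
print(in_default_excluded_dir("src/acme/client.py"))
```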
Best practices
The most valuable code for Kapa is code that users directly interact with: SDKs, client libraries, and public APIs. Internal implementation details typically add noise without helping answer user questions.
Start small and expand
Begin with a focused subset of your repository, such as the main SDK or a specific examples directory. Evaluate how well Kapa answers questions, then gradually expand your scope if needed.
Combine with documentation sources
GitHub Code works best when combined with other sources like web-crawled documentation. The documentation provides context and explanations, while the code provides concrete implementation details and examples.
Troubleshooting
| Problem | Solution |
|---|---|
| Too much noise in answers | Your ingestion scope may be too broad. Use the directory picker and exclude patterns to focus on user-facing code. Exclude test files, generated code, and internal implementation details. |
| Missing expected files | Check that the file extension is in the supported formats list, and verify your include/exclude regex patterns aren't filtering out the files. |
| Dot files are disabled | Kapa does not currently support dot files (files starting with ., such as .env.example or .prettierrc). These appear disabled in the directory picker. |
| Authentication failures | Verify your personal access token has Contents: read-only permission and hasn't expired. |
| No files appearing | Check that your repository contains supported file types and that they match your directory and filter criteria. |
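For authentication failures, one way to check a token outside of Kapa is to request the repository's root contents from the GitHub REST API. The sketch below only builds the request, and sending it is left as a commented step. The helper name `build_contents_request` and the owner, repository, and token values are placeholders. This is a troubleshooting aid under those assumptions, not a description of how Kapa itself connects to GitHub.

```python
import urllib.parse
import urllib.request


def build_contents_request(owner, name, token, ref=None):
    """Build a GET request for a repository's root contents on the GitHub REST API.

    Hypothetical helper for manual troubleshooting.
    """
    url = f"https://api.github.com/repos/{owner}/{name}/contents"
    if ref:
        # Optional branch or tag, e.g. "release/2.0"
        url += "?ref=" + urllib.parse.quote(ref, safe="")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Accept", "application/vnd.github+json")
    request.add_header("Authorization", f"Bearer {token}")
    return request


# Placeholder values; substitute your own owner, repository, and token.
req = build_contents_request("acme", "acme-python-sdk", "YOUR_TOKEN", ref="release/2.0")
print(req.full_url)

# To send it (requires network access):
#   with urllib.request.urlopen(req) as response:
#       print(response.status)
# urlopen raises urllib.error.HTTPError for error responses such as 401 or 404.
```

A 401 response generally points to an invalid or expired token, while a 404 on a private repository often means the token does not have access to that repository.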