GitHub Code

Using GitHub Code as a source allows Kapa to pull documentation and source code directly from your GitHub repository. Documentation often lags behind code changes, and source code contains implementation details, edge cases, and nuances that written documentation cannot fully capture. By ingesting your codebase, Kapa can answer a new category of technical questions, such as how an API is meant to be used or which configuration options exist, even when these are not explicitly documented.

When you connect a repository, Kapa continuously tracks it for changes. Any new or updated files matching your configured file types, directories, and filters are automatically ingested, keeping your AI assistant in sync with your latest code. See data source refreshes for details on update frequency.

When to use GitHub Code

GitHub Code works best for code that's relevant to the questions your users ask. Thoughtfully selecting high-value repositories and directories increases the signal and usefulness of code as a source.

Good candidates:

  • SDKs and client libraries
  • Example repositories and code samples
  • Reference implementations
  • Configuration files and schemas
  • User-facing portions of larger codebases

What to avoid:

  • Test files (often noisy and not useful for answering questions)
  • Generated code (build outputs, compiled files)
  • Dependency directories (node_modules, vendor)
  • Internal implementation details not relevant to users

Supported formats

Documentation files

| Format            | Extensions |
|-------------------|------------|
| Jupyter Notebooks | .ipynb     |
| Markdown          | .md        |
| MDX               | .mdx       |
| Text              | .txt       |

Source code files

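| Language   | Extensions                        |
|------------|-----------------------------------|
| C/C++      | .c, .cpp, .cc, .cxx, .h, .hpp     |
| Go         | .go                               |
| Java       | .java                             |
| JavaScript | .js, .jsx, .mjs, .cjs             |
| Python     | .py                               |
| Rust       | .rs                               |
| TSX        | .tsx                              |
| TypeScript | .ts, .mts, .cts                   |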

Need support for another language? Contact us to request it.
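
To check quickly whether a given path falls under one of the supported file types, the two tables above can be expressed as a lookup. This is a minimal sketch, not Kapa's implementation: the names `SUPPORTED_EXTENSIONS` and `file_type_for` are hypothetical, and matching on the exact final extension is an assumption.

```python
from pathlib import PurePosixPath

# Extension lists copied from the supported-formats tables.
SUPPORTED_EXTENSIONS = {
    "Jupyter Notebooks": {".ipynb"},
    "Markdown": {".md"},
    "MDX": {".mdx"},
    "Text": {".txt"},
    "C/C++": {".c", ".cpp", ".cc", ".cxx", ".h", ".hpp"},
    "Go": {".go"},
    "Java": {".java"},
    "JavaScript": {".js", ".jsx", ".mjs", ".cjs"},
    "Python": {".py"},
    "Rust": {".rs"},
    "TSX": {".tsx"},
    "TypeScript": {".ts", ".mts", ".cts"},
}


def file_type_for(path):
    """Return the file type name for a path, or None if unsupported."""
    suffix = PurePosixPath(path).suffix  # final extension only, e.g. ".py"
    for file_type, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return file_type
    return None
```

For example, `file_type_for("src/acme/client.py")` returns `"Python"`, while a path with an extension outside the tables returns `None`.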

Example: Setting up a Python SDK

Here's a practical example of configuring GitHub Code for a Python SDK repository:

Repository structure:

acme-python-sdk/
├── src/
│ └── acme/
│ ├── client.py
│ ├── models.py
│ └── exceptions.py
├── examples/
│ ├── basic_usage.py
│ └── advanced_config.py
├── tests/
│ └── ...
├── docs/
│ └── ...
└── README.md

Possible configuration:

  • File Types: Python, Markdown
  • Directories: Select src/, examples/, and root (for README.md)

This setup gives Kapa access to your SDK implementation, usage examples, and README. Tests are excluded because the tests/ directory wasn't selected. As new Python or Markdown files are added to src/ or examples/, or your README changes, they'll automatically be included on the next refresh.
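
The selection described above can be sketched as a small filter over the repository's file paths. This is an illustration under stated assumptions, not Kapa's actual logic: `select_files` and `EXTENSIONS` are hypothetical names, the empty string stands for the repository root and selects only files directly in it, and any other selected directory includes its whole subtree. The file names under tests/ and docs/ are made up, since the tree above elides them.

```python
from pathlib import PurePosixPath

# Extensions for the two file types chosen in this example.
EXTENSIONS = {"Python": {".py"}, "Markdown": {".md"}}


def select_files(paths, file_types, directories):
    """Keep paths whose extension and location match the selection.

    Assumption: "" stands for the repository root and selects only files
    that sit directly in it; any other entry selects the whole subtree.
    """
    allowed = set()
    for file_type in file_types:
        allowed |= EXTENSIONS[file_type]

    selected = []
    for path in paths:
        p = PurePosixPath(path)
        if p.suffix not in allowed:
            continue
        parent = "" if str(p.parent) == "." else str(p.parent)
        for directory in directories:
            d = directory.strip("/")
            if d == "":
                in_dir = parent == ""
            else:
                in_dir = parent == d or parent.startswith(d + "/")
            if in_dir:
                selected.append(path)
                break
    return selected


repo = [
    "src/acme/client.py",
    "src/acme/models.py",
    "src/acme/exceptions.py",
    "examples/basic_usage.py",
    "examples/advanced_config.py",
    "tests/test_client.py",  # hypothetical file name, for illustration
    "docs/index.md",         # hypothetical file name, for illustration
    "README.md",
]

chosen = select_files(repo, ["Python", "Markdown"], ["src/", "examples/", ""])
print(chosen)
```

Under these assumptions the result contains the three files under src/, the two under examples/, and README.md, while the tests/ and docs/ entries are left out.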

How code appears in answers

When Kapa references code in its answers, it cites the specific file and line numbers, linking directly to the source:

[Image: Code citation with line numbers]

Clicking a citation takes you directly to the file on GitHub:

[Image: GitHub code view with highlighted lines]

Setup

To connect a GitHub repository, you'll need:

  • The repository owner and name
  • For private repositories: a personal access token with Contents: read-only permission
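
Before entering a token in Kapa, it can help to confirm that the token can read the repository's contents. The sketch below builds a request to the GitHub REST API endpoint for repository contents. `build_contents_request` and `can_read_contents` are hypothetical helper names, and the owner, repository name, and token you pass in are placeholders of your own.

```python
import urllib.error
import urllib.request


def build_contents_request(owner, name, token=None):
    """Build a GET request for the repository's root directory listing."""
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    if token:  # public repositories can be read without a token
        headers["Authorization"] = f"Bearer {token}"
    url = f"https://api.github.com/repos/{owner}/{name}/contents"
    return urllib.request.Request(url, headers=headers)


def can_read_contents(owner, name, token=None):
    """Return True if GitHub answers 200 for the contents request."""
    request = build_contents_request(owner, name, token)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # 401 usually means a bad or expired token; 403 or 404 often mean
        # the token lacks access to this repository.
        return False
```

Calling `can_read_contents("acme", "acme-python-sdk", token)` with your own values should return `True` when the token has read access to the repository contents. Network failures other than HTTP errors are deliberately left to raise.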

Step 1: Connect your repository

  1. Go to the Sources tab on the Kapa platform and click Add new source
  2. Enter a name for the source, select GitHub Code, and click Continue
  3. Specify the GitHub repository by filling in the Owner and Name fields
  4. For private repositories, enter your personal access token
  5. Upon successful connection, a purple text box appears with the repository description

[Image: Connect your GitHub repository]

Step 2: Configure file selection

  1. Select File Types to choose which formats to include (e.g., Markdown, Python, TypeScript)
  2. Use the Directories picker to select specific directories from your repository
  3. Click Save to begin the ingestion process

[Image: GitHub Code configuration]

Tip: For more granular control, expand the advanced options to configure regex patterns for file inclusion/exclusion.

Configuration reference

| Option | Description | Default | Required |
|--------|-------------|---------|----------|
| Owner | GitHub username or organization that owns the repository | None | Yes |
| Name | Name of the GitHub repository | None | Yes |
| Personal access token | Token for authenticating to GitHub (for private repositories) | None | For private repositories |
| Branch or Tag | Pin to a specific branch or tag (e.g., v2.1.0, release/2.0) | Default branch | No |
| File Types | Select specific file types to include (see supported formats) | None | Yes |
| Directories | Select specific directories to include from the repository | All directories | No |
| File Include Regex | (Advanced) Further restrict to files matching this pattern within selected directories | All files | No |
| File Exclude Regex | (Advanced) Exclude files matching this pattern within selected directories | None | No |

Tip: Use Branch or Tag to pin your source to a specific SDK version or freeze the code at a known state. This is useful when you want Kapa to answer questions about a specific release rather than the latest code.

Advanced filtering

Using regex patterns

For granular control within your selected directories, use the advanced regex options. Patterns are matched against the file's relative path from the repository root (e.g., src/client/api.py).

Info: Regex patterns only further restrict files within your selected directories. They cannot include files outside those directories.

Goal                                   Pattern
-------------------------------------  -------------------------------------
Exclude test files by naming pattern   Exclude: _test\.py$|\.test\.(js|ts)$
Exclude generated files                Exclude: \.generated\.|\.g\.
Include only specific file prefix      Include: ^client_
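
To preview how patterns like these behave before saving them, you can run them against a few relative paths locally. This is a minimal sketch, assuming patterns may match anywhere in the path (as Python's `re.search` does) and that an exclude match takes precedence over an include match. Neither assumption is confirmed by this page, and `passes_filters` is a hypothetical helper name.

```python
import re


def passes_filters(path, include=None, exclude=None):
    """Return True if a relative path survives the include/exclude regexes.

    Assumptions: patterns may match anywhere in the path (re.search),
    and an exclude match wins over an include match.
    """
    if exclude and re.search(exclude, path):
        return False
    if include and not re.search(include, path):
        return False
    return True


exclude_tests = r"_test\.py$|\.test\.(js|ts)$"

print(passes_filters("src/client/api.py", exclude=exclude_tests))       # True
print(passes_filters("src/client/api_test.py", exclude=exclude_tests))  # False
print(passes_filters("web/app.test.ts", exclude=exclude_tests))         # False

# Because ^ anchors at the start of the relative path, ^client_ matches
# client_utils.py at the repository root but not src/client_utils.py.
print(passes_filters("client_utils.py", include=r"^client_"))      # True
print(passes_filters("src/client_utils.py", include=r"^client_"))  # False
```

Since patterns are matched against the path from the repository root, an anchored pattern such as `^client_` is worth testing against full paths like these rather than bare file names.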

Default exclusions

Kapa automatically excludes common directories that contain dependencies, caches, or build artifacts:

| Category   | Excluded patterns |
|------------|-------------------|
| JavaScript | node_modules, .next, .nuxt |
| Python     | venv, .venv, __pycache__, __pypackages__, .tox, .pytest_cache, .mypy_cache, .ruff_cache |
| General    | .git |

These exclusions are applied automatically, so you don't need to configure them.
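
To see which files in a local checkout these exclusions would remove, the list can be modeled as a set of directory names. This sketch assumes an exclusion applies whenever one of the names appears as a directory at any depth in the path, which this page does not spell out. `DEFAULT_EXCLUDED_DIRS` and `is_default_excluded` are hypothetical names.

```python
from pathlib import PurePosixPath

# Directory names copied from the default-exclusions table.
DEFAULT_EXCLUDED_DIRS = {
    "node_modules", ".next", ".nuxt",
    "venv", ".venv", "__pycache__", "__pypackages__",
    ".tox", ".pytest_cache", ".mypy_cache", ".ruff_cache",
    ".git",
}


def is_default_excluded(path):
    """Return True if any directory in the path is on the excluded list."""
    directories = PurePosixPath(path).parts[:-1]  # drop the file name
    return any(part in DEFAULT_EXCLUDED_DIRS for part in directories)
```

For instance, `is_default_excluded("web/node_modules/react/index.js")` is `True`, while `is_default_excluded("src/acme/client.py")` is `False`.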

Best practices

Less is more

The most valuable code for Kapa is code that users directly interact with: SDKs, client libraries, and public APIs. Internal implementation details typically add noise without helping answer user questions.

Start small and expand

Begin with a focused subset of your repository, such as the main SDK or a specific examples directory. Evaluate how well Kapa answers questions, then gradually expand your scope if needed.

Combine with documentation sources

GitHub Code works best when combined with other sources like web-crawled documentation. The documentation provides context and explanations, while the code provides concrete implementation details and examples.

Troubleshooting

| Problem | Solution |
|---------|----------|
| Too much noise in answers | Your ingestion scope may be too broad. Use the directory picker and exclude patterns to focus on user-facing code. Exclude test files, generated code, and internal implementation details. |
| Missing expected files | Check that the file extension is in the supported formats list, and verify your include/exclude regex patterns aren't filtering out the files. |
| Dot files are disabled | Kapa does not currently support dot files (files starting with ., such as .env.example or .prettierrc). These appear disabled in the directory picker. |
| Authentication failures | Verify your personal access token has Contents: read-only permission and hasn't expired. |
| No files appearing | Check that your repository contains supported file types and that they match your directory and filter criteria. |