GitHub Issues
The Kapa platform provides an integration to ingest GitHub issues as a data source. Developers who cannot find an answer in the official documentation often turn to a project's GitHub issues next. GitHub issues contain workarounds, explanations, open discussions, and other information that complements your documentation. We therefore recommend connecting your GitHub issues as a data source, but there are some caveats to be aware of.
Prerequisites
- A GitHub repository containing issues
- Repository owner and name information
- For private repositories, a personal access token with appropriate permissions
Data ingested
When you connect Kapa to GitHub Issues, the following data is ingested:
- Issue URLs
- Issue titles and body content
- Comments and discussion threads on issues
- Issue status
- User information (anonymized)
Permissions required
The following permissions are required when using a personal access token for private repositories:
Permission | Purpose | Security considerations |
---|---|---|
Issues: read-only | Enables access to issues, comments, and labels | Kapa cannot create or modify issues |
We recommend using a fine-grained access token limited to only the repositories you want to connect to Kapa.
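Before entering a token in Kapa, it can help to confirm that it actually grants read access to issues. The following is a minimal sketch, assuming the public GitHub REST API at `api.github.com`. The helper name `build_issues_request` and the `octo-org/example` repository are hypothetical. The function only builds the request and sends nothing.

```python
import urllib.request
from typing import Optional

GITHUB_API = "https://api.github.com"


def build_issues_request(owner: str, name: str,
                         token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a read-only request for a repository's issues."""
    # per_page=1 keeps the response small; we only care whether access works.
    url = f"{GITHUB_API}/repos/{owner}/{name}/issues?state=all&per_page=1"
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    if token:
        # Public repositories can be read without a token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned request to `urllib.request.urlopen` performs the check. A 200 response suggests the token can read issues, while a 401 or 404 typically points to an invalid token or missing access to the repository.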
Setup
Step 1: Connect your repository
- Go to the Sources tab on the Kapa platform and click on Add new source
- Enter a name for the source, select GitHub Issues, and click Continue
- Specify the GitHub repository to use by filling in the Owner and Name fields
- If it's a private repository, enter a personal access token for authentication
- Upon successful connection, a purple text box appears, providing you with the repository description
Step 2: Filter your issues
Configure the following optional parameters to only select a subset of your issues to include:
- Choose Issue state to include open issues, closed issues, or both
- Set an Issue age filter to only include recently updated issues
- Select Include issue labels to target specific issue categories
- Select Exclude issue labels to filter out irrelevant issues
- Click Save to begin the ingestion process
Configuration options
The following configuration options are available for the GitHub Issues integration:
Option | Description | Default | Required |
---|---|---|---|
Owner | GitHub username or organization that owns the repository | None | Yes |
Name | Name of the GitHub repository | None | Yes |
Personal access token | Token for authenticating to GitHub (for private repositories) | None | For private repos |
Issue state | Filter issues by state (open, closed, or both) | Both | No |
Issue age | Only include issues updated within the specified time range | All time | No |
Include issue labels | Only include issues with one of the specified labels | All labels | No |
Exclude issue labels | Exclude issues with any of the specified labels | None | No |
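To reason about how these options interact, it can be useful to model them in code. The sketch below is an illustration based only on the descriptions in the table, not Kapa's actual implementation. The `Issue` and `IssueFilter` types are hypothetical, and it assumes that an excluded label takes precedence when an issue carries both an included and an excluded label.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class IssueFilter:
    state: str = "both"                       # "open", "closed", or "both"
    max_age: Optional[timedelta] = None       # None means "All time"
    include_labels: frozenset = frozenset()   # empty means "All labels"
    exclude_labels: frozenset = frozenset()   # empty means nothing is excluded


@dataclass
class Issue:
    state: str                 # "open" or "closed"
    updated_at: datetime       # time of the last update
    labels: frozenset = frozenset()


def matches(issue: Issue, f: IssueFilter, now: datetime) -> bool:
    """Return True if the issue passes every configured filter."""
    if f.state != "both" and issue.state != f.state:
        return False
    if f.max_age is not None and now - issue.updated_at > f.max_age:
        return False
    # Include: the issue needs at least one of the listed labels.
    if f.include_labels and not (issue.labels & f.include_labels):
        return False
    # Exclude: any listed label rejects the issue (assumed to win over include).
    if issue.labels & f.exclude_labels:
        return False
    return True
```

With the defaults (`IssueFilter()`), every issue passes, which mirrors the defaults listed in the table.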
Best practices
Experiment with filtering
It's difficult to recommend a general set of filters, as what works well is highly project-dependent. Issues with certain labels might be irrelevant or misleading, issues of a certain age might be outdated and contain false information, or closed issues might generally be irrelevant for users of your project.
By experimenting with different filter settings and reviewing Kapa's conversations in production, you can determine what the right level is for your repository.
Use a dedicated label to exclude outdated issues
A common problem is that outdated issues are hard to exclude with a general Issue age filter. An issue from three years ago might still be highly relevant while a three-month-old issue now contains false information because of a new release. A manual but effective solution to this problem is to create a new dedicated label for excluding outdated issues (e.g., exclude-kapa) and exclude it via the Exclude issue labels filter.
Review your conversations in the Kapa platform periodically and when you notice Kapa giving a false or undesired answer because of an outdated GitHub issue, manually apply the exclude-kapa label to it. Through this manual workflow, you can weed out bad content from your GitHub issues without losing good content by applying too broad of a filter.
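The label can be applied in the GitHub UI, but if you flag issues often, a small script may be convenient. The following is a minimal sketch, assuming the GitHub REST API endpoint for adding labels to an issue. The helper name `build_label_request` is hypothetical, and the function only builds the request. Note that adding labels needs a token with write access to issues, which is separate from the read-only token Kapa uses.

```python
import json
import urllib.request


def build_label_request(owner: str, name: str, issue_number: int,
                        token: str, label: str = "exclude-kapa") -> urllib.request.Request:
    """Build (but do not send) a request that adds a label to one issue."""
    url = f"https://api.github.com/repos/{owner}/{name}/issues/{issue_number}/labels"
    # The endpoint adds the given labels to those already on the issue.
    body = json.dumps({"labels": [label]}).encode("utf-8")
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Sending the request with `urllib.request.urlopen` applies the label. The issue should then drop out of the source at the next ingestion, provided exclude-kapa is listed in the Exclude issue labels filter.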
Potentially split your GitHub issues into multiple sources
It might be too difficult to select an effective subset of GitHub issues using a single source. For example, issues with the label Bug are relevant as long as they are open, no matter their age. Once they are closed, however, it no longer makes sense to include them. Issues with the label Discussion, on the other hand, are usually relevant if they are no older than two years, regardless of their status. In these scenarios, create multiple sources so that each one can apply its own filters.
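The Bug and Discussion example can be sketched as two source configurations. This is an illustration only: the dictionary layout, the source names, and the helper functions are hypothetical and do not reflect how Kapa stores its settings. Two years is approximated as 730 days, and age is measured from the last update.

```python
from datetime import datetime, timedelta, timezone

# Two hypothetical sources, each with its own filter settings.
SOURCES = [
    {"name": "bugs", "state": "open", "max_age": None,
     "include_labels": {"Bug"}},
    {"name": "discussions", "state": "both", "max_age": timedelta(days=730),
     "include_labels": {"Discussion"}},
]


def source_accepts(source: dict, issue: dict, now: datetime) -> bool:
    """Return True if the issue passes this source's state, age, and label filters."""
    if source["state"] != "both" and issue["state"] != source["state"]:
        return False
    if source["max_age"] is not None and now - issue["updated_at"] > source["max_age"]:
        return False
    return bool(source["include_labels"] & issue["labels"])


def ingested_by(issue: dict, now: datetime) -> list:
    """List the names of all sources that would include the issue."""
    return [s["name"] for s in SOURCES if source_accepts(s, issue, now)]
```

A single source cannot express both rules at once, because its state and age filters apply to every label. Splitting the rules across two sources lets each label keep its own criteria.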
Troubleshooting
- Authentication failures: Verify your personal access token has the required permissions and hasn't expired
- No issues appearing: Check that your repository contains issues matching your filter criteria
- Outdated information: Use the dedicated label approach to exclude specific outdated issues