
GitHub Issues

The kapa platform provides an integration to ingest GitHub issues as a data source. Developers who cannot find a solution in the official documentation often turn to a project's GitHub issues next. GitHub issues contain workarounds, explanations, open discussions, and much other information that helps augment your documentation. We therefore always recommend connecting your GitHub issues as a data source, but there are some caveats to be aware of.

Set up

Step 1: Connect your repository

Connect your GitHub repository by filling out the Owner and Name input fields. You can only connect public repositories.

Step 2: Filter your issues

You can use the following optional parameters to include only a subset of your issues.

  • Issue state: Only include issues that have a certain state (open, closed).
  • Issue age: Only include issues that have been updated within the specified time range.
  • Include issue labels: Only include issues that have one of the specified labels.
  • Exclude issue labels: Exclude issues that have one of the specified labels.
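To make the four filters concrete, here is a minimal Python sketch of how they could be evaluated against a list of issues. It is an illustration only, not kapa's implementation: the `Issue` and `IssueFilter` classes are hypothetical, and the sketch assumes that all configured filters must pass at the same time, which this page does not state explicitly.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass
class Issue:
    number: int
    state: str            # "open" or "closed"
    updated_at: datetime  # time of the last update
    labels: Set[str] = field(default_factory=set)


@dataclass
class IssueFilter:
    # Hypothetical model of the four filters; None means "do not filter".
    states: Optional[Set[str]] = None
    max_age: Optional[timedelta] = None
    include_labels: Optional[Set[str]] = None
    exclude_labels: Set[str] = field(default_factory=set)

    def matches(self, issue: Issue, now: datetime) -> bool:
        """Return True if the issue passes every configured filter (assumed AND)."""
        if self.states is not None and issue.state not in self.states:
            return False
        if self.max_age is not None and now - issue.updated_at > self.max_age:
            return False
        # Include: the issue needs at least one of the listed labels.
        if self.include_labels is not None and not (issue.labels & self.include_labels):
            return False
        # Exclude: a single listed label is enough to drop the issue.
        if issue.labels & self.exclude_labels:
            return False
        return True


now = datetime(2024, 1, 1)
issues = [
    Issue(1, "open", datetime(2023, 11, 5), {"bug"}),
    Issue(2, "closed", datetime(2023, 12, 1), {"bug"}),
    Issue(3, "open", datetime(2021, 3, 9), {"question"}),
    Issue(4, "open", datetime(2023, 12, 20), {"bug", "wontfix"}),
]
recent_open = IssueFilter(
    states={"open"},
    max_age=timedelta(days=365),
    exclude_labels={"wontfix"},
)
selected = [issue.number for issue in issues if recent_open.matches(issue, now)]
print(selected)  # prints [1]
```

Only issue 1 survives: issue 2 is closed, issue 3 was last updated more than a year ago, and issue 4 carries an excluded label.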

Best Practices

Start with all issues

When you include GitHub issues in your data sources for the first time, we recommend ingesting all of them as a starting point. By reviewing kapa's conversations in production, you will quickly notice if this is not ideal. Issues with certain labels might be irrelevant or misleading, issues of a certain age might be outdated and contain false information, or closed issues might be generally irrelevant for users of your project. These things are easy to see once you are running in production but hard to guess beforehand. Hence, we recommend starting with all of your issues and then adjusting the filters until you are happy with kapa's behaviour in production.

It is hard for us to recommend a general set of filters because what works well is highly project dependent.

Use a dedicated label to exclude outdated issues

A common problem is that outdated issues are hard to exclude with a general Issue age filter. An issue from three years ago might still be highly relevant, while a three-month-old issue might already contain false information because of a new release. A manual but effective solution to this problem is to create a dedicated label for outdated issues, for example exclude-kapa, and exclude it via the Exclude issue labels filter.

Review your conversations in the kapa dashboard periodically. When you notice kapa giving a false or undesired answer because of an outdated GitHub issue, manually apply the exclude-kapa label to it. Over time, this manual workflow weeds out bad content from your GitHub issues without losing good content to an overly broad filter.
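If you prefer to apply the label from a script rather than the GitHub web interface, the following sketch uses the GitHub REST API endpoint for adding labels to an issue. The helper names, the your-org/your-repo placeholders, and the token handling are assumptions for illustration; the label itself must already exist in the repository or GitHub will create it with default settings.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_add_label_request(owner, repo, issue_number, label, token):
    """Build (but do not send) a request that adds one label to an issue."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/issues/{issue_number}/labels"
    body = json.dumps({"labels": [label]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def add_label(owner, repo, issue_number, label, token):
    """Send the request and return GitHub's JSON response (the issue's labels)."""
    request = build_add_label_request(owner, repo, issue_number, label, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as add_label("your-org", "your-repo", 42, "exclude-kapa", token) would tag issue 42, assuming the token has permission to edit issues in that repository. The issue is then dropped by the Exclude issue labels filter the next time the source is refreshed.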

Potentially split your GitHub issues into multiple sources

It might be too difficult to select an effective subset of GitHub issues using a single source. For example, issues with the label Bug are relevant as long as they are open, no matter their issue age. However, once they are closed, it no longer makes sense to include them. On the other hand, issues with the label Discussion are usually relevant if they are no older than two years, but their state does not matter. In these scenarios, it makes sense to create multiple sources so that you can apply effective filtering to each.
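The Bug and Discussion example can be sketched as two separate source definitions, each with its own filters. The dictionaries below are a hypothetical representation, not kapa's configuration format, and the sketch assumes that "no older than two years" is measured from the issue's last update, in line with the Issue age filter.

```python
from datetime import datetime, timedelta

# Hypothetical source definitions mirroring the filter names on this page.
SOURCES = [
    {
        "name": "open-bugs",
        "include_labels": {"Bug"},
        "states": {"open"},
        "max_age": None,  # age does not matter for open bugs
    },
    {
        "name": "recent-discussions",
        "include_labels": {"Discussion"},
        "states": {"open", "closed"},  # state does not matter
        "max_age": timedelta(days=730),  # roughly two years
    },
]


def source_accepts(source, issue, now):
    """Return True if the issue passes all filters of one source."""
    if not (issue["labels"] & source["include_labels"]):
        return False
    if issue["state"] not in source["states"]:
        return False
    if source["max_age"] is not None and now - issue["updated_at"] > source["max_age"]:
        return False
    return True


def sources_for(issue, now):
    """List the names of all sources that would ingest the issue."""
    return [s["name"] for s in SOURCES if source_accepts(s, issue, now)]


now = datetime(2024, 1, 1)
old_open_bug = {"labels": {"Bug"}, "state": "open", "updated_at": datetime(2019, 5, 1)}
closed_bug = {"labels": {"Bug"}, "state": "closed", "updated_at": datetime(2023, 12, 1)}
closed_discussion = {"labels": {"Discussion"}, "state": "closed", "updated_at": datetime(2023, 6, 1)}
stale_discussion = {"labels": {"Discussion"}, "state": "open", "updated_at": datetime(2020, 1, 1)}

print(sources_for(old_open_bug, now))       # prints ['open-bugs']
print(sources_for(closed_bug, now))         # prints []
print(sources_for(closed_discussion, now))  # prints ['recent-discussions']
print(sources_for(stale_discussion, now))   # prints []
```

A single source cannot express both rules, because its state and age filters would apply to Bug and Discussion issues alike.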