Refreshes

To keep your knowledge up to date, Kapa regularly fetches updates from your sources and syncs changes automatically. The only exception is web crawling, where large changes may require human review.

Data refresh frequency

Different data sources have different data refresh schedules. You can see when Kapa last ingested your sources, and when the next refresh is scheduled, in the Sources view.

The following table shows the update schedule for each source. New content refers to newly created objects, such as a new JIRA issue or Notion page. Updates are changes to existing objects, such as comments on an existing issue. Deletions are removals of entire objects, such as an entire issue or Slack thread.

| Source | New content | Updates | Deletions |
| --- | --- | --- | --- |
| Confluence | 10 minutes | 10 minutes | 48 hours |
| Discord | Two hours | Two hours | 24 hours |
| Discourse | 10 minutes | 10 minutes | 48 hours |
| GitHub discussions | 10 minutes | 10 minutes | 48 hours |
| GitHub Code | One hour | One hour | One hour |
| GitHub issues | Five minutes | Five minutes | 48 hours |
| GitHub pull requests | 10 minutes | 10 minutes | 48 hours |
| Google Drive | 10 minutes | 10 minutes | 24 hours |
| JIRA | 10 minutes | 10 minutes | 48 hours |
| Jira Service Management | Three hours | Three hours | 12 hours |
| Notion | Five minutes | Five minutes | 48 hours |
| OpenAPI | One hour | One hour | One hour |
| S3 Bucket | 10 minutes | 10 minutes | 24 hours |
| Salesforce Knowledge | One hour | One hour | 48 hours |
| Slack | Six hours | Six hours | 24 hours |
| Web crawling | 24 hours | 24 hours | 24 hours |
| YouTube | 24 hours | 24 hours | 24 hours |
| Zendesk helpcenter | 10 minutes | 10 minutes | 10 minutes |
| Zendesk tickets | Five minutes | Five minutes | Five minutes |

These schedules take into account the rate limits of the third-party systems.

Slack

Slack has some specific edge cases. The Slack API doesn't allow pulling changes beyond a given timestamp, and has restrictive rate limits.

To work around this, Kapa scans back seven days, every six hours. In other words, every six hours, Kapa re-reads all threads from the last seven days and looks for anything that was added, changed, or deleted.

This means:

  • Kapa accurately reflects changes every six hours.
  • Kapa can't see changes to older threads. If a thread from eight or more days ago has changes, Kapa won't pick this up.
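The lookback behavior can be illustrated with a minimal sketch. It is not Kapa's implementation: the function name `is_rescanned` is hypothetical, and it assumes the seven-day window is measured from the thread's creation time with an inclusive boundary, which the text above does not specify.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(days=7)        # from the text: each scan looks back seven days
SCAN_INTERVAL = timedelta(hours=6)  # from the text: a scan runs every six hours


def is_rescanned(thread_created_at: datetime, scan_time: datetime) -> bool:
    """Return True if the thread falls inside the lookback window of a scan.

    Assumption: the window is measured from the thread's creation time
    and includes its boundaries.
    """
    return scan_time - LOOKBACK <= thread_created_at <= scan_time


scan_time = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
recent_thread = scan_time - timedelta(days=3)
old_thread = scan_time - timedelta(days=8)

print(is_rescanned(recent_thread, scan_time))  # True: edits are picked up
print(is_rescanned(old_thread, scan_time))     # False: edits are missed
```

Under these assumptions, a change to a thread inside the window is picked up by the next scan, at most `SCAN_INTERVAL` later.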

If you choose to only keep Slack data for a given time period, Kapa deletes threads that fall outside that timeframe every 48 hours.

Web crawling

Every 24 hours, Kapa re-crawls all pages on your configured websites and compares each page's content against what was previously ingested. Only new, modified, and deleted pages are synced downstream.
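As a rough illustration of this kind of comparison, the sketch below classifies pages as new, modified, or deleted between two crawls. The use of SHA-256 content hashes and the names `fingerprint` and `diff_crawl` are assumptions made for the example; the text above only states that page content is compared, not how.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Hash page content so two versions can be compared cheaply (assumed approach)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def diff_crawl(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two crawls, each a mapping of URL to page content."""
    new = sorted(url for url in current if url not in previous)
    deleted = sorted(url for url in previous if url not in current)
    modified = sorted(
        url
        for url in current
        if url in previous and fingerprint(current[url]) != fingerprint(previous[url])
    )
    return {"new": new, "modified": modified, "deleted": deleted}


previous_crawl = {
    "https://example.com/a": "Hello",
    "https://example.com/b": "World",
    "https://example.com/c": "Old page",
}
current_crawl = {
    "https://example.com/a": "Hello",
    "https://example.com/b": "World!",
    "https://example.com/d": "New page",
}

changes = diff_crawl(previous_crawl, current_crawl)
print(changes["new"])       # ['https://example.com/d']
print(changes["modified"])  # ['https://example.com/b']
print(changes["deleted"])   # ['https://example.com/c']
```

Unchanged pages such as `https://example.com/a` appear in none of the three lists, so nothing is synced for them.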

If more than 30% of pages have changed (including new, modified, and deleted pages) since the last crawl, the update is routed to human review instead of being applied automatically. This works because breaking changes, such as a site redesign, restructuring, or misconfiguration, almost always affect all or a significant portion of your website, which triggers the threshold. Day-to-day content changes typically affect only a small number of pages, so they are applied automatically with high confidence.
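The threshold rule can be sketched as a small check. The function name `needs_human_review` is hypothetical, and the text does not say which page count serves as the denominator, so `total_pages` is left as a parameter for the caller to define. This is an illustration of the arithmetic, not Kapa's actual logic.

```python
REVIEW_THRESHOLD_PERCENT = 30  # from the text: more than 30% changed triggers review


def needs_human_review(new: int, modified: int, deleted: int, total_pages: int) -> bool:
    """Return True if the share of changed pages is strictly above the threshold.

    Assumption: total_pages is whatever page count the percentage is
    measured against; the source does not define it.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    changed = new + modified + deleted
    # Integer comparison avoids floating-point rounding at the boundary.
    return changed * 100 > REVIEW_THRESHOLD_PERCENT * total_pages


print(needs_human_review(new=5, modified=10, deleted=2, total_pages=100))   # False (17%)
print(needs_human_review(new=10, modified=15, deleted=5, total_pages=100))  # False (exactly 30%)
print(needs_human_review(new=20, modified=60, deleted=10, total_pages=100)) # True (90%)
```

Because the rule says "more than 30%", a crawl where exactly 30% of pages changed is still applied automatically in this sketch.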

Sources awaiting review need to be approved before the data syncs to production.