Hosted MCP server
Deploy a hosted MCP server for your Kapa project in a single click to expose your knowledge sources over MCP. It can be used in three ways:
- External users in AI tools and editors: Agents in tools like Cursor, Claude Code, or VS Code get up-to-date context about your product, which gives more accurate AI responses and lets external users query your documentation without leaving their editor. This is intended for projects that expose public information only.
- Internal teams in AI tools: Employees can access documentation and internal knowledge sources from tools like ChatGPT or Claude, so GTM, support, and solutions teams get more accurate answers without switching context. Access is restricted to team members with Kapa accounts.
- Connect your agent to your docs: Your agents can call the semantic search tool to tap into deep, product-specific knowledge from your docs and guides, so they can explain features, answer questions, and help users get work done.
Setup and configuration
To set up an MCP server for your Kapa instance:
- In Kapa, click Integrations > + Add new integration.
- Choose Hosted MCP Server.
- Click Continue.
- Configure the Subdomain. This becomes the first part of the URL clients use to connect to your MCP server, in the form <subdomain>.mcp.kapa.ai.
- Configure the Server name. This becomes the MCP server label (server_name / serverLabel) that clients see when listing or calling tools from this server; it does not affect the URL or subdomain.
- Choose the Authentication type. See Authentication for details on each type.
There are additional elements that can be configured:
- Server instructions: Custom instructions for the MCP server.
- Semantic retrieval tool name and description: How the tool appears to AI tools and agents, which is what they use to decide when to call it. See Customizing the tool name and description for when to change these and when to leave the default.
- Source groups: Restrict the server to only return results from specific source groups. When configured, the server only searches sources in the selected groups (plus any global sources), regardless of what clients request. See Hosted MCP server configuration for details.
Kapa provides a default configuration that is suitable for most use cases. You can customize the tool name and description under Advanced configuration in your MCP integration settings. To change the server instructions, contact support@kapa.ai.
Authentication
Public (OAuth)
Use this for external users in AI tools and editors. Your server is publicly accessible, but Kapa requires users to authenticate with a Google or GitHub account.
When a user connects for the first time, Kapa presents a provider picker where the user can choose to sign in with Google or GitHub. The selected provider completes an OAuth login. Kapa uses the anonymous user ID from the chosen provider only to enforce per-user rate limits and prevent abuse of your MCP server.
Google
- Kapa requests only the openid scope.
- Receives an ID token (a JWT) that contains a stable, opaque user ID (sub).
- Does not request the email or profile scopes, so Kapa does not see the user's name, email address, or other personal data.
On the Google consent screen, this appears as Associate you with your personal info on Google.
This is Google's generic wording for the openid scope: it means the app can recognize that the same Google account is signing in again. It does not grant access to the user's email, name, contacts, or other data, which would require additional scopes such as email or profile.
GitHub
- Kapa requests no OAuth scopes.
- With no scopes requested, GitHub grants read-only access to public profile information only.
- Kapa uses the stable, opaque GitHub user ID (id) solely for rate limiting.
- Does not access repositories, organizations, email addresses, or other GitHub data.
API key
Use this to connect your agent to your docs. Your server requires a project API key via the Authorization header on every request.
Authorization: Bearer <YOUR_API_KEY>
How you set this header depends on the MCP client or agent framework you use, but in all cases keep the API key in your backend. Never expose it in client-side code or send it to the browser.
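As a minimal sketch of the guidance above, the snippet below reads the key from the backend environment and builds the header. The environment variable name `KAPA_API_KEY` and both helper names are assumptions made for illustration; how the resulting headers are handed to the connection depends on your MCP client.

```python
import os


def load_api_key(env_var: str = "KAPA_API_KEY") -> str:
    """Read the project API key from the backend environment.

    The variable name is hypothetical; use whatever your secret store provides.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} in your backend environment")
    return key


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in the Bearer form shown above."""
    if not api_key or not api_key.strip():
        raise ValueError("API key is empty; refusing to build headers")
    return {"Authorization": f"Bearer {api_key.strip()}"}
```

Failing fast on a missing key keeps a misconfigured deployment from sending unauthenticated requests, and keeping both helpers on the server side means the key never reaches browser code.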
Internal (Kapa account)
Use this for internal teams in AI tools. Your server is restricted to employees with a Kapa account.
When a user connects to your internal MCP for the first time, they are directed to the Kapa login page. The user then logs in with their Kapa account, using whichever authentication methods are permitted for your Kapa instance.
To access the internal MCP server, the user account must have the Use Internal Chat Assistant permission for the project. Refer to Roles and permissions for more information on managing project permissions.
Rate limits
| Authentication type | Per-user limit | Per-team limit |
|---|---|---|
| Public (OAuth) | 300 requests per day | 60 requests per minute |
| Internal (OAuth) | 300 requests per day | 60 requests per minute |
| API key | — | 60 requests per minute |
If you need higher limits, contact support@kapa.ai.
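If your agent can burst above the per-minute limit, a client-side guard can avoid sending requests that would be rejected. The sketch below is one possible approach, not part of Kapa's API: the class name is hypothetical, and the sliding-window behavior is an assumption, since this page does not state how the server measures its windows.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard allowing at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int = 60, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) >= self.limit:
            return False
        self._calls.append(now)
        return True
```

Call `try_acquire()` before each tool call and queue or delay the request when it returns `False`. The server-side limits in the table still apply regardless of what the client does.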
Latency
The MCP server wraps the Retrieval API and has the same typical latency:
- p50: ~3 seconds
- p95: ~4.5 seconds
This is higher than a simple embedding-based or keyword search because the underlying retrieval pipeline is multi-step and tuned for high recall. For each tool call, the pipeline runs multiple search iterations using both embedding-based and sparse retrieval. Query decomposition and keyword generation produce the queries for these iterations, and reranking is used to refine relevance. The tradeoff is fewer missed-but-relevant chunks at the cost of higher latency.
Enabling use_pruning adds one more model call at the end of this pipeline, which costs roughly another 0.7 seconds per query.
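These figures can inform a client-side timeout. The sketch below derives a budget from the typical p95 plus the approximate pruning cost; the `headroom` multiplier of 2.0 and the function names are arbitrary assumptions for illustration, not recommendations from Kapa.

```python
import asyncio

P95_SECONDS = 4.5      # typical p95 quoted above
PRUNING_SECONDS = 0.7  # approximate extra cost when use_pruning is enabled


def timeout_budget(use_pruning: bool, headroom: float = 2.0) -> float:
    """Return a client timeout: p95 (plus pruning) times a safety factor."""
    base = P95_SECONDS + (PRUNING_SECONDS if use_pruning else 0.0)
    return base * headroom


async def call_with_timeout(make_call, use_pruning: bool = False):
    """Await the coroutine returned by `make_call()`, giving up after the budget.

    Raises asyncio.TimeoutError if the call takes longer than the budget.
    """
    return await asyncio.wait_for(make_call(), timeout=timeout_budget(use_pruning))
```

Here `make_call` would be a zero-argument function returning the coroutine for your tool call, so the same wrapper works with any MCP client.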
Tools
Kapa's hosted MCP server exposes a single semantic search tool:
search_<PRODUCT_NAME>_knowledge_sources
This tool lets AI tools and agents perform semantic retrieval over your product's documentation and other knowledge sources.
This tool:
- Searches all knowledge sources connected to your Kapa project for a given query.
- Returns the most relevant chunks, in descending order of relevance.
- Each chunk is a short, self-contained snippet of text taken from a single page or item (for example, part of a documentation page).
Results are returned as a structured list of objects with:
- source_url – the URL of the original source.
- content – the chunk content in Markdown.
If you cannot use an MCP server, the Retrieval API endpoint provides the same functionality via a standard HTTP API.
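To illustrate the result shape described above, the sketch below turns a list of chunk objects into a single context string that keeps each source URL next to its content. It assumes the results have already been parsed into Python dictionaries with the `source_url` and `content` keys; the function name and the sample data are hypothetical.

```python
def format_chunks(chunks: list[dict[str, str]]) -> str:
    """Join retrieved chunks into one context string, keeping source URLs."""
    sections = []
    for index, chunk in enumerate(chunks, start=1):
        sections.append(
            f"[{index}] Source: {chunk['source_url']}\n{chunk['content'].strip()}"
        )
    return "\n\n".join(sections)


# Hypothetical results, already in descending order of relevance.
example = [
    {"source_url": "https://docs.example.com/sso", "content": "## SSO\nEnable SSO in settings."},
    {"source_url": "https://docs.example.com/roles", "content": "Roles control access."},
]
print(format_chunks(example))
```

Numbering the chunks and keeping the URLs makes it easier for a downstream model to cite where an answer came from.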
Customizing the tool name and description
By default, Kapa exposes the semantic search tool as search_<PRODUCT_NAME>_knowledge_sources with a description written to work well across clients. This default is a good fit for most projects, so change the tool name or description only when you have a concrete reason to (see When customizing helps); otherwise, leave it as is. You can change both in your MCP integration settings, under Advanced configuration, where the current default description is also shown. Before you do, it helps to understand what the model actually sees.
How an agent sees the tool
When a client connects, it loads the tool's name, description, and input schema into the model's context window, alongside every other tool available in that client or agent. During generation, the model relies on that text, and only that text, to decide whether to call your tool, which tool to call, and what to pass as the query.
The name and description therefore determine whether your knowledge base is used at all.
When customizing helps
- The tool searches something other than your product documentation. The default description is written for the most common setup, indexing the knowledge sources that document your product, so it is framed around your product. If your server instead indexes a different collection, for example internal go-to-market material or the documentation for your engineering team's upstream dependencies, that framing no longer describes what the tool searches. Rename the tool and rewrite the description so they tell the client what the collection actually contains.
- Your users and agent use different vocabulary than the default. If people refer to your documentation by a specific name, or your product has terms the default does not mention, adding those terms to the description improves the chance the model reaches for the tool on a relevant question.
- You want to frame the tool's job differently. By default the description tells the model to use the tool for questions about your product. You might want a different policy, for example calling it regardless of user intent, or using it only as a last resort. The description is where you set that expectation. This behavior is usually split between the tool description and the surrounding instructions, whether that is the system prompt in an agent you build or a skill in a client like Claude Code, but the tool description often carries part of it.
When you do change the description, keep it concise. It is sent on every request, so a long description spends tokens and can dilute the signal that makes the model choose the tool.
Changing the description is safe at any time: connected clients pick it up on their next tool refresh. Changing the tool name is a breaking change for clients that already cached the old name. Agents will not find the tool until they reconnect and rediscover tools, and any tool name hard-coded in your code or _meta calls must be updated. Treat a name change like an API change that you roll out deliberately.
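One way to make a rename less disruptive is to resolve the tool name from the list the server advertises instead of scattering a hard-coded string through your code. The sketch below is an illustration under assumptions: the helper name is hypothetical, and the pattern match only works while the tool keeps the default `search_<PRODUCT_NAME>_knowledge_sources` form.

```python
from typing import Optional

DEFAULT_PREFIX = "search_"
DEFAULT_SUFFIX = "_knowledge_sources"


def find_search_tool(tool_names: list[str], configured_name: Optional[str] = None) -> str:
    """Pick the semantic search tool from the names a server advertises."""
    if configured_name is not None:
        # An explicitly configured name must still be advertised by the server.
        if configured_name in tool_names:
            return configured_name
        raise LookupError(f"Tool {configured_name!r} is not advertised; was it renamed?")
    matches = [
        name for name in tool_names
        if name.startswith(DEFAULT_PREFIX) and name.endswith(DEFAULT_SUFFIX)
    ]
    if len(matches) != 1:
        raise LookupError(f"Expected one default-pattern tool, found {matches}")
    return matches[0]
```

With the MCP Python SDK, the input list can be built from the result of `await session.list_tools()`, for example `[tool.name for tool in result.tools]`. Failing loudly at startup when the expected name is missing surfaces a rename before users hit it.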
Programmatic configuration via _meta
When integrating the Kapa MCP server into your agent via code, you can pass additional parameters via the MCP _meta field to control retrieval behavior and track end users. These parameters are not part of the tool's input schema, so they are set directly in your code and are not visible to tool-calling models.
These parameters require an MCP server that uses API key authentication. They are not available for public or internal OAuth-authenticated servers, because they are set by developers at API call time.
Retrieval parameters
| Parameter | Type | Description |
|---|---|---|
| use_pruning | boolean, optional | Optionally prune low relevance chunks after retrieval, at the cost of added latency. The number of returned chunks becomes variable and may be significantly lower than top_k; pruning always keeps the 2 most relevant chunks where top_k and max_chars permit. Defaults to false. Read about how it works in How we prune RAG context. |
| top_k | integer (1-15), optional | The maximum number of chunks to return. Fewer may be returned if max_chars or use_pruning reduce the result set. Defaults to 15. |
| max_chars | integer (1-60000), optional | Maximum number of characters across all returned chunks. Chunks are included in order of relevance, but only up to the point where the total character count stays within this limit. Chunks are never truncated. This is an upper bound, not a target: especially with use_pruning enabled, the returned total may be well below this limit. Defaults to 35,000. |
| source_ids_include | array of UUIDs, optional | Only return results from these specific sources. |
| source_group_ids_include | array of UUIDs, optional | Only return results from sources in these groups. If the server is also configured with source groups, the intersection of the two lists is used. |
| redact_query | boolean, optional | If true, the query text is redacted from analytics. Use for sensitive queries. |
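To make the interaction between top_k and max_chars concrete, the sketch below simulates the selection rule as described in the table: whole chunks are taken in relevance order until the next one would push the total past the limit. This is an illustration of the documented behavior, not Kapa's implementation; the function name is hypothetical, and stopping at the first chunk that does not fit is one reading of "only up to the point where the total character count stays within this limit".

```python
def select_chunks(ranked: list[str], top_k: int = 15, max_chars: int = 35_000) -> list[str]:
    """Keep whole chunks, in rank order, while the running total fits max_chars."""
    selected = []
    total = 0
    for chunk in ranked[:top_k]:
        if total + len(chunk) > max_chars:
            break  # never truncate: stop at the first chunk that would not fit
        selected.append(chunk)
        total += len(chunk)
    return selected


# Hypothetical chunks, most relevant first.
ranked = ["a" * 400, "b" * 500, "c" * 300]
print([len(c) for c in select_chunks(ranked, top_k=3, max_chars=1_000)])  # [400, 500]
```

The third chunk is dropped rather than cut to 100 characters, which is why the returned total (900) sits below the limit.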
User tracking
To associate queries with end users in your analytics, you can optionally pass a user object. This information appears in your dashboards at app.kapa.ai.
| Parameter | Type | Description |
|---|---|---|
| user.email | string, optional | User's email address. |
| user.unique_client_id | string, optional | Your own identifier for the user (e.g., an ID from your system), useful for linking Kapa analytics with your internal data. |
| user.company_name | string, optional | User's company name. |
| user.first_name | string, optional | User's first name. |
| user.last_name | string, optional | User's last name. |
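Because these values are set in your own code rather than by the model, it can help to validate them before each call. The helper below is a hypothetical sketch that checks the ranges and user fields listed in the two tables above and returns a dictionary suitable for the `_meta` field; it covers only a subset of the parameters, and it is not part of any Kapa SDK.

```python
from typing import Any, Optional

ALLOWED_USER_FIELDS = {"email", "unique_client_id", "company_name", "first_name", "last_name"}


def build_meta(
    top_k: int = 15,
    max_chars: int = 35_000,
    use_pruning: bool = False,
    user: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Validate the documented ranges and return a dict for `_meta`."""
    if not 1 <= top_k <= 15:
        raise ValueError("top_k must be between 1 and 15")
    if not 1 <= max_chars <= 60_000:
        raise ValueError("max_chars must be between 1 and 60000")
    meta: dict[str, Any] = {
        "top_k": top_k,
        "max_chars": max_chars,
        "use_pruning": use_pruning,
    }
    if user:
        unknown = set(user) - ALLOWED_USER_FIELDS
        if unknown:
            raise ValueError(f"Unknown user fields: {sorted(unknown)}")
        meta["user"] = dict(user)  # copy so later edits do not leak into the request
    return meta
```

Catching an out-of-range value locally gives a clearer error than a failed tool call, and rejecting unknown user fields guards against typos such as `company` instead of `company_name`.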
Example
The exact pattern for setting _meta parameters will depend on your agent framework and codebase. Here's how you pass them at the call site using the MCP Python SDK:
result = await session.call_tool(
    name="search_acme_knowledge_sources",
    # Dynamic argument - provided by your agent or user input
    arguments={"query": "How do I configure SSO?"},
    # Meta parameters - preset configuration controlled by your code
    # The SDK automatically maps this field to `_meta` in the JSON-RPC request
    meta={
        "top_k": 5,
        "max_chars": 35_000,
        "source_ids_include": ["550e8400-e29b-41d4-a716-446655440000"],
        "source_group_ids_include": ["86ee1d82-d96e-4219-9290-b2a07d3abd8d"],
        "user": {
            "email": current_user.email,
            "unique_client_id": current_user.id,
        },
    },
)
For raw JSON-RPC requests (e.g., from n8n), use _meta (with underscore) directly:
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "search_acme_knowledge_sources",
    "arguments": {"query": "How do I configure SSO?"},
    "_meta": {
      "top_k": 5,
      "max_chars": 35000,
      "source_ids_include": ["550e8400-e29b-41d4-a716-446655440000"],
      "source_group_ids_include": ["86ee1d82-d96e-4219-9290-b2a07d3abd8d"]
    }
  },
  "id": 1
}
Use cases
External users in AI tools and editors (public MCP)
Your hosted MCP server works with any MCP-compatible AI tool or editor so developers can query your docs without leaving their workflow.
Share your MCP with your external users
If you already have a Website Widget live, you should advertise your MCP server via the MCP install menu in the widget header.
To enable it, follow the steps in Website Widget | Configuration → MCP install menu, which covers all required widget attributes, examples, and behavior.
You should also add a dedicated page to your documentation with setup instructions for popular AI tools. See the page describing the Kapa MCP server for an example.
Share your MCP on social media, in developer newsletters, or alongside other help resources in your docs. See our guide on Driving Users to MCP for a full playbook.
See the installation instructions for Kapa's own documentation MCP server as an example of how to install Kapa Hosted MCP servers in various clients.
Connect your agent to your docs (API key access)
If you're building an agent, whether with the Kapa Agent SDK, LangGraph, OpenAI Agents, or your own framework, you can connect it to your knowledge base via MCP. The semantic search tool gives your agent access to your docs and guides so it can answer product questions alongside its other capabilities.
The Kapa Agent SDK provides a complete in-product agent with streaming, custom tools, human-in-the-loop approval, and a full chat UI out of the box. Knowledge base search is built in, so no MCP setup is needed.
What agents typically do
Common patterns include:
- Data and operations: Run queries, inspect entities, or pull recent activity using your own APIs.
- Creating assets and workflows: Generate dashboards, data pipelines, or analytics notebooks on behalf of the user.
- Debugging and guidance: Investigate failed runs or errors and return a structured explanation or fix plan.
- Product help: Answer "How do I…?" questions about features, configuration, and best practices.
Agents are typically powered by a reasoning model that can call tools, often orchestrated by an agent framework such as OpenAI's Agent Builder, LangGraph, or a lightweight in-house orchestration layer.
Follow our Connect your agent to your docs with MCP tutorial for a hands-on walkthrough using LangGraph and Kapa's MCP server. Once connected, see Tuning knowledge base search for your agent for guidance on retrieval size, pruning, and prompting.
In this setup:
- Native tools handle your own functionality (APIs, mutations, queries, object creation, and so on).
- The Kapa semantic search tool provides product-specific knowledge from your knowledge base.
Why they need knowledge base context
In practice, in-product agents almost always need access to your own docs and knowledge sources:
- Users naturally ask knowledge questions ("How do I set this up?", "What does this error mean?") directly in the chat, and those answers live in your docs.
- Interactions with native tools often trigger follow-up questions ("Why did this query fail?", "What does this setting do?") that require documentation to explain.
- The reasoning model itself sometimes needs docs to use tools correctly, for example to learn how to write queries in a product-specific language, understand valid parameters, or interpret and fix errors returned by other tools.
The Kapa semantic search tool gives your agent this context.
Internal teams in AI tools (internal MCP)
Expose documentation and internal, non-public knowledge sources to your own team inside AI tools like ChatGPT or Claude. Your hosted MCP server works with any MCP-compatible AI tool, so employees such as Solutions Engineers, Customer Success, Sales, and Support can access accurate, product-specific knowledge without leaving their workflow.
Access is restricted to employees with Kapa accounts, so projects that include internal knowledge sources stay limited to your team.
Connecting your hosted MCP to your AI tools
Follow Kapa's tutorial to Connect your internal hosted MCP to ChatGPT or Claude.
About MCP
If you're new to MCP, the official documentation has a good introduction.
MCP (Model Context Protocol) is an open-source standard for connecting AI applications to external systems. (Source: What is the Model Context Protocol (MCP)?)
This means that AI assistants and agents can access knowledge and tools beyond their own training data and capabilities. For example:
- AI coding tools like Claude Code and Cursor can connect to a documentation MCP to get up-to-date, comprehensive knowledge of a product.
- An AI agent can connect to your calendar, so it can act as a personal assistant.
- An internal company chatbot can connect to company databases and wikis, providing a single interface to pull together datasets.
Kapa's MCP can be used with AI assistants like Claude Code and Cursor, as well as to connect your own agents to your knowledge base.