Conversation API
Guide to the Query and Thread endpoints for creating and managing conversations with kapa.ai.
The Conversation API provides four routes to create and maintain a conversation:
- Query (Non-Streamed): To ask a question and get an answer.
- Query (Streamed): To ask a question and get a streamed answer.
- Thread (Non-Streamed): To ask a follow-up question within an existing conversation and get an answer.
- Thread (Streamed): To ask a follow-up question within an existing conversation and get a streamed answer.
Below, you will find detailed information on each API route, including the route, example usage (cURL), request parameters, and response examples.
1. Query (Non-Streamed)
Generate an answer for a query. The entire completion is generated before being sent back in a single response.
API Route
GET /query/v1
Example Usage
curl -X GET \
'<KAPA_API_ENDPOINT>/query/v1?query=How+do+I+get+started?' \
-H 'X-API-TOKEN: <KAPA_API_TOKEN>'
Request Parameters
Parameter | Type | Description |
---|---|---|
query | string | The question to ask kapa.ai |
Response
The response for the Query Endpoint will be a JSON object containing the following fields:
{
"answer": "To get started, please follow our getting started guide...",
"thread_id": "abd4eef3-46e4-46d6-956a-accbed07fa7c",
"question_answer_id": "ea8cdf61-82c1-4469-95be-ca3fa30ec267"
}
Field | Type | Description |
---|---|---|
answer | string | The answer provided by kapa.ai |
thread_id | string | The unique ID of the answer thread |
question_answer_id | string | The unique ID of the question-answer pair |
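As a minimal sketch, the same request could be made from Python with only the standard library. The helper names `build_query_url` and `ask` are hypothetical and not part of the kapa.ai API; only the route, the `query` parameter, the `X-API-TOKEN` header, and the response fields are taken from this guide.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(endpoint: str, query: str) -> str:
    """Build the non-streamed Query URL, URL-encoding the question."""
    params = urllib.parse.urlencode({"query": query})
    return f"{endpoint.rstrip('/')}/query/v1?{params}"


def ask(endpoint: str, token: str, query: str) -> dict:
    """Send the query and return the decoded JSON body (hypothetical helper)."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"X-API-TOKEN": token},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        # Expected keys per this guide: answer, thread_id, question_answer_id.
        return json.loads(response.read().decode("utf-8"))
```

Calling `ask(endpoint, token, "How do I get started?")` with your own endpoint and token should return a dictionary whose `thread_id` can be reused for follow-up questions.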
2. Query (Streamed)
Generate an answer for a query. The streamed response allows the server to send partial responses to the client as soon as they are available (e.g., in "ChatGPT-style typing mode"). This approach is useful for providing the client with relevant sources and the beginning of the completion before the entire completion is finished. The response is sent in chunks over a single connection, allowing for a more interactive and dynamic user experience.
API Route
GET /query/v1/stream
Example Usage
curl -X GET \
'<KAPA_API_ENDPOINT>/query/v1/stream?query=How+do+I+get+started?' \
-H 'X-API-TOKEN: <KAPA_API_TOKEN>'
See the linked example of how to create a React-based JS client for consuming the Streamed API.
Request Parameters
Parameter | Type | Description |
---|---|---|
query | string | The question to ask kapa.ai |
Response
The response for the Query Streamed Endpoint is sent as a series of chunks using the "chunked" transfer encoding. Each chunk contains a JSON object that may represent a relevant source, a partial answer, identifiers or an error. The "chunked" transfer encoding allows the server to send partial responses to the client as soon as they are available, without specifying the Content-Length header.
- type: The type of the JSON object in the stream, which can be relevant_sources, partial_answer, identifiers, or error.
- content: The content of the JSON object, which depends on type:
  - relevant_sources - a list of source URLs
  - partial_answer - a partially generated answer to the user question
  - identifiers - the thread_id and question_answer_id
  - error - the reason for the error during answer generation
- stream_end: Boolean indicator of whether this is the last chunk of the response.
A successful flow of responses contains:
1. A single chunk containing a list of relevant sources
2. Multiple chunks, each containing a partial answer that ultimately "add up" to the whole answer of the user query
3. A single chunk containing the identifiers
An unsuccessful flow of responses will include a single chunk of type error, which terminates the response.
Example Response #1: relevant_sources
{
"chunk": {
"type": "relevant_sources",
"content": [
{
"source_url": "docs.example.com"
},
{
"source_url": "docs.other.com"
}
],
"stream_end": false
}
}
Example Response #2: partial_answer
{
"chunk": {
"type": "partial_answer",
"content": {
"text": "To get"
},
"stream_end": false
}
}
Example Response #3: identifiers
{
"chunk": {
"type": "identifiers",
"content": {
"thread_id": "abd4eef3-46e4-46d6-956a-accbed07fa7c",
"question_answer_id": "ea8cdf61-82c1-4469-95be-ca3fa30ec267"
},
"stream_end": true
}
}
Example Response #4: error
{
"chunk": {
"type": "error",
"content": {
"reason": "An error occurred during answer generation."
},
"stream_end": true
}
}
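The successful and unsuccessful flows described above can be sketched as a small fold over the chunks. This is an illustrative sketch, not a kapa.ai client: it assumes the stream has already been split and decoded into dictionaries shaped like the example responses, and the names `StreamedAnswer` and `collect_chunks` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class StreamedAnswer:
    sources: List[str] = field(default_factory=list)
    answer: str = ""
    thread_id: Optional[str] = None
    question_answer_id: Optional[str] = None


def collect_chunks(messages: Iterable[dict]) -> StreamedAnswer:
    """Fold already-decoded stream messages into a single result."""
    result = StreamedAnswer()
    for message in messages:
        chunk = message["chunk"]
        kind, content = chunk["type"], chunk["content"]
        if kind == "relevant_sources":
            result.sources.extend(item["source_url"] for item in content)
        elif kind == "partial_answer":
            # Partial answers are concatenated in arrival order.
            result.answer += content["text"]
        elif kind == "identifiers":
            result.thread_id = content["thread_id"]
            result.question_answer_id = content["question_answer_id"]
        elif kind == "error":
            # An error chunk terminates the response.
            raise RuntimeError(content["reason"])
        if chunk["stream_end"]:
            break
    return result
```

In an interactive client you would typically render each partial answer as it arrives rather than waiting for the full result; the fold is shown here only to make the chunk order explicit.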
Response Parsing
Most chunks are received separately, but this cannot be guaranteed. The kapa.ai backend sends each chunk separately, but because of buffering along the network path, your client may receive several chunks at once or only part of a chunk. To make parsing this data stream straightforward, a special delimiter, U+241E (symbol for record separator, ␞), is appended to each JSON chunk string. The example below shows how two chunks received at the same time look as a single string.
{
"chunk":{
"type":"partial_answer",
"content":{
"text":" one"
},
"stream_end": false
}
}␞
{
"chunk":{
"type":"partial_answer",
"content":{
"text":" two"
},
"stream_end": false
}
}␞
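One way to handle merged or partial chunks is to buffer incoming data and split it on the delimiter. The sketch below assumes the stream is UTF-8 encoded and that U+241E follows every JSON chunk, as described above; the class name `ChunkParser` is hypothetical. Because U+241E occupies three bytes in UTF-8, an incremental decoder is used so that a network read ending in the middle of the delimiter does not fail.

```python
import codecs
import json
from typing import List

RECORD_SEPARATOR = "\u241e"  # delimiter appended to each JSON chunk


class ChunkParser:
    """Incrementally split a byte stream on U+241E and decode each JSON chunk."""

    def __init__(self) -> None:
        self._decoder = codecs.getincrementaldecoder("utf-8")()
        self._buffer = ""

    def feed(self, data: bytes) -> List[dict]:
        """Add raw bytes and return every chunk completed so far."""
        self._buffer += self._decoder.decode(data)
        # Everything before the last delimiter is complete; the rest stays buffered.
        *complete, self._buffer = self._buffer.split(RECORD_SEPARATOR)
        return [json.loads(part) for part in complete if part.strip()]
```

Each call to `feed` returns zero or more decoded chunks, so the caller can pass in whatever the network delivers without tracking chunk boundaries itself.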
3. Thread (Non-Streamed)
Generate an answer for a query as part of an existing conversational thread. The entire conversation reply is generated before being sent back in a single response.
API Route
GET /query/v1/thread/<THREAD_ID>
Example Usage
curl -X GET \
'<KAPA_API_ENDPOINT>/query/v1/thread/<THREAD_ID>?query=What+was+my+first+query' \
-H 'X-API-TOKEN: <KAPA_API_TOKEN>'
Request Parameters
Parameter | Type | Description |
---|---|---|
thread_id | string | The unique ID of the answer thread from a prior query |
query | string | The follow-up question based on the previous context |
Response
The response for the Thread Endpoint will also be a JSON object containing the following fields:
{
"answer": "Your first query was 'How do I get started?'",
"thread_id": "abd4eef3-46e4-46d6-956a-accbed07fa7c",
"question_answer_id": "8fd4c69f-7004-4565-a394-f8e2cce53b7c"
}
Field | Type | Description |
---|---|---|
answer | string | The answer provided by kapa.ai, based on the context of the previous question |
thread_id | string | The unique ID of the answer thread |
question_answer_id | string | The unique ID of the question-answer pair |
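To illustrate how a follow-up request is addressed, the sketch below builds the Thread URL from a `thread_id` returned by an earlier query. The function name `build_thread_url` and its `stream` flag are hypothetical; the flag simply appends the `/stream` suffix used by the Thread (Streamed) route.

```python
import urllib.parse


def build_thread_url(endpoint: str, thread_id: str, query: str, stream: bool = False) -> str:
    """Build the URL for a follow-up question in an existing thread."""
    path = f"/query/v1/thread/{urllib.parse.quote(thread_id, safe='')}"
    if stream:
        path += "/stream"
    params = urllib.parse.urlencode({"query": query})
    return f"{endpoint.rstrip('/')}{path}?{params}"
```

The request itself is sent the same way as a first query, with the `X-API-TOKEN` header set.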
4. Thread (Streamed)
Generate an answer for a query as part of an existing conversational thread. The streamed response allows the server to send partial responses to the client as soon as they are available (e.g., in "ChatGPT-style typing mode"). This approach is useful for providing the client with relevant sources and the beginning of the completion before the entire completion is finished. The response is sent in chunks over a single connection, allowing for a more interactive and dynamic user experience.
API Route
GET /query/v1/thread/<THREAD_ID>/stream
Example Usage
curl -X GET \
'<KAPA_API_ENDPOINT>/query/v1/thread/<THREAD_ID>/stream?query=What+was+my+first+query' \
-H 'X-API-TOKEN: <KAPA_API_TOKEN>'
Request Parameters
Parameter | Type | Description |
---|---|---|
thread_id | string | The unique ID of the answer thread from a prior query |
query | string | The follow-up question based on the previous context |
Response
The response returned is identical to the response returned by 2. Query (Streamed).