Conversation API

Guide to the Query and Thread endpoints for creating and managing conversations with kapa.ai.

The Conversation API provides four routes to create and maintain a conversation:

  1. Query (Non-Streamed): To ask a question and get an answer.
  2. Query (Streamed): To ask a question and get a streamed answer.
  3. Thread (Non-Streamed): To ask a follow up question within an existing conversation and get an answer.
  4. Thread (Streamed): To ask a follow up question within an existing conversation and get a streamed response.

Below, you will find detailed information on each API route, including the route itself, example usage (cURL), request parameters, and response examples.

1. Query (Non-Streamed)

Generate an answer for a query. The entire completion is generated before being sent back in a single response.

API Route

GET /query/v1

Example Usage

curl -X GET \
  '<KAPA_API_ENDPOINT>/query/v1?query=How+do+I+get+started?' \
  -H 'X-API-TOKEN: <KAPA_API_TOKEN>'

Request Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| query | string | The question to ask kapa.ai |

Response

The response for the Query Endpoint will be a JSON object containing the following fields:

{
  "answer": "To get started, please follow our getting started guide...",
  "thread_id": "abd4eef3-46e4-46d6-956a-accbed07fa7c",
  "question_answer_id": "ea8cdf61-82c1-4469-95be-ca3fa30ec267"
}

| Field | Type | Description |
| --- | --- | --- |
| answer | string | The answer provided by kapa.ai |
| thread_id | string | The unique ID of the answer thread |
| question_answer_id | string | The unique ID of the question-answer pair |
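
The request and response above can be sketched in Python. This is a minimal illustration, not an official client: the names `QueryResult`, `build_query_url`, `parse_query_response`, and `ask` are hypothetical, and the endpoint and token are whatever values you substitute for `<KAPA_API_ENDPOINT>` and `<KAPA_API_TOKEN>`. Only the route, the `query` parameter, the `X-API-TOKEN` header, and the three response fields come from this page.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass


@dataclass
class QueryResult:
    """The three fields documented for the Query response (hypothetical type)."""
    answer: str
    thread_id: str
    question_answer_id: str


def build_query_url(endpoint: str, query: str) -> str:
    """Return the /query/v1 URL with the question percent-encoded."""
    params = urllib.parse.urlencode({"query": query})
    return f"{endpoint.rstrip('/')}/query/v1?{params}"


def parse_query_response(body: str) -> QueryResult:
    """Pick the documented fields out of the JSON response body."""
    data = json.loads(body)
    return QueryResult(data["answer"], data["thread_id"], data["question_answer_id"])


def ask(endpoint: str, token: str, query: str) -> QueryResult:
    """Send the GET request with the X-API-TOKEN header (performs network I/O)."""
    request = urllib.request.Request(
        build_query_url(endpoint, query), headers={"X-API-TOKEN": token}
    )
    with urllib.request.urlopen(request) as response:
        return parse_query_response(response.read().decode("utf-8"))
```

Keeping the `thread_id` from the result is what makes follow-up questions possible through the Thread routes.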

2. Query (Streamed)

Generate an answer for a query. The streamed response allows the server to send partial responses to the client as soon as they are available (e.g., in "ChatGPT-style typing mode"). This approach is useful for providing the client with relevant sources and the beginning of the completion before the entire completion is finished. The response is sent in chunks over a single connection, allowing for a more interactive and dynamic user experience.

API Route

GET /query/v1/stream

Example Usage

curl -X GET \
  '<KAPA_API_ENDPOINT>/query/v1/stream?query=How+do+I+get+started?' \
  -H 'X-API-TOKEN: <KAPA_API_TOKEN>'

Example Usage

See the linked example of how to create a React-based JavaScript client for consuming the Streamed API.

Request Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| query | string | The question to ask kapa.ai |

Response

The response for the Query Streamed Endpoint is sent as a series of chunks using the "chunked" transfer encoding. Each chunk contains a JSON object that may represent a relevant source, a partial answer, identifiers or an error. The "chunked" transfer encoding allows the server to send partial responses to the client as soon as they are available, without specifying the Content-Length header.

  • type: The type of the JSON object in the stream which can be relevant_sources, partial_answer, identifiers, or error

  • content: The content of the JSON object, which depends on type:

    • relevant_sources - a list of objects, each holding a source_url
    • partial_answer - partially generated answer to user question
    • identifiers - IDs for the thread_id and question_answer_id
    • error - reason for error during answer generation
  • stream_end: boolean indicator if this is the last chunk of the response

A successful flow of responses contains:

1. A single chunk containing a list of relevant sources

2. Multiple chunks, each containing a partial answer, which together "add up" to the whole answer to the user query

3. A single chunk containing the identifiers (thread_id and question_answer_id)

An unsuccessful flow of responses will include a single chunk of type error, which terminates the response.

Example Response #1: relevant_sources

{
  "chunk": {
    "type": "relevant_sources",
    "content": [
      {
        "source_url": "docs.example.com"
      },
      {
        "source_url": "docs.other.com"
      }
    ],
    "stream_end": false
  }
}

Example Response #2: partial_answer

{
  "chunk": {
    "type": "partial_answer",
    "content": {
      "text": "To get"
    },
    "stream_end": false
  }
}

Example Response #3: identifiers

{
  "chunk": {
    "type": "identifiers",
    "content": {
      "thread_id": "abd4eef3-46e4-46d6-956a-accbed07fa7c",
      "question_answer_id": "ea8cdf61-82c1-4469-95be-ca3fa30ec267"
    },
    "stream_end": true
  }
}

Example Response #4: error

{
  "chunk": {
    "type": "error",
    "content": {
      "reason": "An error occurred during answer generation."
    },
    "stream_end": true
  }
}
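
The flow of chunk types described above can be sketched as a small reducer in Python. This is an illustration under stated assumptions, not kapa.ai's client code: `assemble`, `StreamedAnswer`, and `StreamError` are hypothetical names, and the input is assumed to be an iterable of already-decoded JSON objects shaped like the example responses.

```python
from dataclasses import dataclass, field
from typing import Optional


class StreamError(Exception):
    """Raised when the stream contains a chunk of type "error" (hypothetical)."""


@dataclass
class StreamedAnswer:
    sources: list = field(default_factory=list)
    text: str = ""
    thread_id: Optional[str] = None
    question_answer_id: Optional[str] = None


def assemble(messages) -> StreamedAnswer:
    """Fold decoded stream messages into one answer object."""
    result = StreamedAnswer()
    for message in messages:
        chunk = message["chunk"]
        kind, content = chunk["type"], chunk["content"]
        if kind == "relevant_sources":
            result.sources = [item["source_url"] for item in content]
        elif kind == "partial_answer":
            # Partial answers are concatenated in arrival order.
            result.text += content["text"]
        elif kind == "identifiers":
            result.thread_id = content["thread_id"]
            result.question_answer_id = content["question_answer_id"]
        elif kind == "error":
            raise StreamError(content["reason"])
        if chunk.get("stream_end"):
            break  # The last chunk of the response has been seen.
    return result
```

In an interactive client you would typically render each `partial_answer` as it arrives rather than waiting for the full result; the reducer only shows how the pieces relate.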

Response Parsing

Most chunks are received separately, but this cannot be guaranteed. The kapa.ai backend sends each chunk separately, but because of buffering across the network, your client may receive several chunks, or only part of a chunk, in a single read. To make parsing this data stream straightforward, a special delimiter, U+241E (the symbol for record separator, ␞), is appended to each JSON chunk string. Below is an example of what two chunks received at the same time look like as a string.

{"chunk":{"type":"partial_answer","content":{"text":" one"},"stream_end":false}}␞{"chunk":{"type":"partial_answer","content":{"text":" two"},"stream_end":false}}␞
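
One way to cope with merged or partial chunks is to buffer incoming data and split it on the delimiter. The Python sketch below assumes the client receives raw UTF-8 bytes in arbitrary pieces; `ChunkParser` and `feed` are hypothetical names, not part of any kapa.ai library. An incremental decoder is used because the delimiter occupies several bytes in UTF-8 and a network read may end in the middle of it.

```python
import codecs
import json

DELIMITER = "\u241e"  # U+241E, the delimiter named in the text above


class ChunkParser:
    """Buffers streamed bytes and yields complete JSON messages (hypothetical)."""

    def __init__(self):
        # The incremental decoder holds back incomplete multi-byte sequences.
        self._decoder = codecs.getincrementaldecoder("utf-8")()
        self._buffer = ""

    def feed(self, data: bytes) -> list:
        """Add received bytes; return every message completed so far."""
        self._buffer += self._decoder.decode(data)
        # Everything before the last delimiter is complete; the rest stays buffered.
        *complete, self._buffer = self._buffer.split(DELIMITER)
        return [json.loads(part) for part in complete if part.strip()]
```

Each call to `feed` returns zero or more decoded messages, so the caller can pass in whatever the network layer delivers without caring where chunk boundaries fall.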

3. Thread (Non-Streamed)

Generate an answer for a query as part of an existing conversational thread. The entire conversation reply is generated before being sent back in a single response.

API Route

GET /query/v1/thread/<THREAD_ID>

Example Usage

curl -X GET \
  '<KAPA_API_ENDPOINT>/query/v1/thread/<THREAD_ID>?query=What+was+my+first+query' \
  -H 'X-API-TOKEN: <KAPA_API_TOKEN>'

Request Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| thread_id | string | The unique ID of the answer thread from a prior query |
| query | string | The follow-up question based on the previous context |

Response

The response for the Thread Endpoint will also be a JSON object containing the following fields:

{
  "answer": "Your first query was 'How do I get started?'",
  "thread_id": "abd4eef3-46e4-46d6-956a-accbed07fa7c",
  "question_answer_id": "8fd4c69f-7004-4565-a394-f8e2cce53b7c"
}

| Field | Type | Description |
| --- | --- | --- |
| answer | string | The answer provided by kapa.ai, based on the context of the previous question |
| thread_id | string | The unique ID of the answer thread |
| question_answer_id | string | The unique ID of the question-answer pair |
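
A follow-up request differs from the initial query only in its URL, which carries the `thread_id` returned earlier. The Python sketch below builds that URL; `build_thread_url` is a hypothetical helper, and the `stream` flag simply appends the `/stream` suffix used by the streamed Thread route.

```python
import urllib.parse


def build_thread_url(endpoint: str, thread_id: str, query: str, stream: bool = False) -> str:
    """Return the Thread route URL for a follow-up question (hypothetical helper)."""
    # The thread ID is a path segment; the follow-up question is a query parameter.
    path = f"/query/v1/thread/{urllib.parse.quote(thread_id, safe='')}"
    if stream:
        path += "/stream"
    params = urllib.parse.urlencode({"query": query})
    return f"{endpoint.rstrip('/')}{path}?{params}"
```

The request itself is sent the same way as the initial query, with the `X-API-TOKEN` header.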

4. Thread (Streamed)

Generate an answer for a query as part of an existing conversational thread. The streamed response allows the server to send partial responses to the client as soon as they are available (e.g., in "ChatGPT-style typing mode"). This approach is useful for providing the client with relevant sources and the beginning of the completion before the entire completion is finished. The response is sent in chunks over a single connection, allowing for a more interactive and dynamic user experience.

API Route

GET /query/v1/thread/<THREAD_ID>/stream

Example Usage

curl -X GET \
  '<KAPA_API_ENDPOINT>/query/v1/thread/<THREAD_ID>/stream?query=What+was+my+first+query' \
  -H 'X-API-TOKEN: <KAPA_API_TOKEN>'

Request Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| thread_id | string | The unique ID of the answer thread from a prior query |
| query | string | The follow-up question based on the previous context |

Response

The response returned is identical to the response returned by 2. Query (Streamed).