
Custom Chat API

In addition to the Conversation API, kapa offers an API that gives you control over the prompting used to generate text. You can use it to build applications that fall outside the standard kapa behavior. The Custom Chat API draws on the same data sources as the Conversation API, but gives you full control over the prompting and limited control over retrieval.

API Route

POST /query/v1/chat/custom

Request Body

The following request body parameters can be submitted:

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| messages | Message[] | List of chat messages | - |
| persist_answer | boolean | Whether to persist the query and the generated answer | true |
| use_retrieval | boolean | Whether to use retrieval for the generation | true |
| retrieval_query | string | (optional) Query used for retrieval instead of the query message | - |
| generation_model | string | Model used for generation. Available options: gpt-4, gpt-4-0613, gpt-4-turbo-preview, gpt-4-0125-preview (see OpenAI models) | - |

Message Type

The messages submitted to the Custom Chat API represent the prompting. Message objects have the following structure:

| Field | Type | Description |
| --- | --- | --- |
| role | string | Available roles are system, user, query, assistant and context |
| content | string | The content of the message |

The message roles system, user and assistant correspond to the message types of the OpenAI chat interface. The roles query and context are extensions introduced by the kapa system.

  • system: There can be only one system message per prompt. It is used to set the behavior of the assistant at the start of the conversation.
  • user: User messages represent the input from the user to the AI. You should write your instructions as user messages.
  • assistant: Assistant messages represent responses generated by the AI.
  • query: There can be only one query message. The query message is used for retrieval if no retrieval_query is given, and it is persisted alongside the answer if persist_answer is true. It is treated as a user message when sent to GPT-4.
  • context: There can be only one context message. The context message is a placeholder that marks where the retrieval context is inserted.
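
The role rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the kapa API: the function name `validate_messages` and the wording of the error strings are assumptions made for illustration, and kapa may enforce further rules on the server that are not described here.

```python
from collections import Counter

# Roles named in this section.
ALLOWED_ROLES = {"system", "user", "assistant", "query", "context"}
# Roles that may appear at most once per prompt, per the list above.
SINGLETON_ROLES = ("system", "query", "context")


def validate_messages(messages):
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical client-side helper; it only covers the rules stated
    in this section.
    """
    problems = []
    counts = Counter()
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {index}: unknown role {role!r}")
            continue
        counts[role] += 1
    for role in SINGLETON_ROLES:
        if counts[role] > 1:
            problems.append(
                f"only one {role} message is allowed, found {counts[role]}"
            )
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a prompt at once.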

Example Request Body

{
  "persist_answer": true,
  "use_retrieval": true,
  "retrieval_query": "What are the most recent blog articles for our database?",
  "messages": [
    {
      "role": "system",
      "content": "You are a smart sales person for a database"
    },
    {
      "role": "user",
      "content": "You are given a few recent blog articles. Please use them to write an outbound email template targeted at CTOs."
    },
    {
      "role": "context"
    },
    {
      "role": "user",
      "content": "Sales Template:"
    }
  ]
}

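A request like the example above could be assembled in Python as follows. This is a sketch under stated assumptions: the base URL is a placeholder, the helper name `build_custom_chat_request` is made up for illustration, and this section does not describe authentication, so any required credentials are left to the caller through the `headers` argument. Only the route and the body parameters come from this page.

```python
import json
import urllib.request

# Route documented above.
CUSTOM_CHAT_PATH = "/query/v1/chat/custom"


def build_custom_chat_request(base_url, messages, *, persist_answer=True,
                              use_retrieval=True, retrieval_query=None,
                              generation_model=None, headers=None):
    """Build (but do not send) a POST request for the Custom Chat API.

    `base_url` and `headers` are assumptions: this section names neither
    the host nor the authentication scheme.
    """
    body = {
        "messages": messages,
        "persist_answer": persist_answer,
        "use_retrieval": use_retrieval,
    }
    # Optional parameters are only included when explicitly set.
    if retrieval_query is not None:
        body["retrieval_query"] = retrieval_query
    if generation_model is not None:
        body["generation_model"] = generation_model

    all_headers = {"Content-Type": "application/json"}
    all_headers.update(headers or {})
    return urllib.request.Request(
        base_url.rstrip("/") + CUSTOM_CHAT_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers=all_headers,
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the real host and credentials are known.
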
What happens when I submit a Request?

When you submit a request, kapa performs the following steps:

  1. kapa performs semantic search over your knowledge sources using the content of the query message. If a retrieval_query is specified, it is used instead.
  2. kapa replaces the context message with user messages containing the relevant context it found during retrieval.
  3. The query message is converted into a user message.
  4. All messages are sent to OpenAI.
  5. If persist_answer is true, the query message is persisted for analytics alongside the generated answer.
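
Steps 1 to 3 can be pictured with a small local simulation. This is only an illustration of the message rewriting described above, not kapa's implementation: the function names are hypothetical, retrieval is replaced by a list of strings passed in by the caller, and emitting one user message per retrieved chunk is an assumption about how the context is rendered.

```python
def resolve_retrieval_query(messages, retrieval_query=None):
    """Step 1: pick the text used for retrieval.

    An explicit retrieval_query wins; otherwise the content of the
    query message is used. Returns None if neither is present.
    """
    if retrieval_query is not None:
        return retrieval_query
    for message in messages:
        if message["role"] == "query":
            return message["content"]
    return None


def expand_messages(messages, retrieved_chunks):
    """Steps 2 and 3: build the final message list.

    The context placeholder becomes user messages (one per chunk, an
    assumption) and the query message becomes a user message.
    """
    final = []
    for message in messages:
        if message["role"] == "context":
            final.extend(
                {"role": "user", "content": chunk} for chunk in retrieved_chunks
            )
        elif message["role"] == "query":
            final.append({"role": "user", "content": message["content"]})
        else:
            final.append(dict(message))
    return final
```

After this rewriting, only system, user and assistant roles remain, which matches the roles of the OpenAI chat interface mentioned earlier.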

Response Body

The following response body is returned for each request.

| Field | Type | Description |
| --- | --- | --- |
| answer | Message[] | The generated answer |
| thread_id | string | The id of the created thread which the query is part of |
| question_answer_id | string | The id of the created query-answer pair |
| messages | Message[] | The final messages sent to OpenAI after retrieval |
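
A client might read the response fields listed above into a small typed container. The sketch below is an assumption-laden example, not an official client: the class and function names are hypothetical, the field values are left loosely typed because this section does not show a sample response, and the JSON in the usage lines is invented purely for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any

# Field names taken from the response body description above.
RESPONSE_FIELDS = ("answer", "thread_id", "question_answer_id", "messages")


@dataclass
class CustomChatResponse:
    answer: Any              # the generated answer
    thread_id: Any           # id of the thread the query belongs to
    question_answer_id: Any  # id of the stored query-answer pair
    messages: list           # final messages sent after retrieval


def parse_custom_chat_response(raw):
    """Parse a JSON response string, failing loudly on missing fields."""
    data = json.loads(raw)
    missing = [name for name in RESPONSE_FIELDS if name not in data]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return CustomChatResponse(**{name: data[name] for name in RESPONSE_FIELDS})


# Usage with an invented payload (all values are placeholders):
sample = json.dumps({
    "answer": [{"role": "assistant", "content": "Hello"}],
    "thread_id": "thread-placeholder",
    "question_answer_id": "qa-placeholder",
    "messages": [{"role": "user", "content": "Hi"}],
})
response = parse_custom_chat_response(sample)
```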