Custom Chat API

In addition to the Conversation API, kapa offers the Custom Chat API, which gives you control over the prompting used to generate text. You can use it to build applications that fall outside the standard kapa behavior. The Custom Chat API draws on the same data sources as the Conversation API, but gives you full control over the prompting and limited control over retrieval.

API Route

POST /query/v1/chat/custom

Request Body

The following request body parameters can be submitted:

Parameter	Type	Description	Default
messages	Message[]	List of chat messages	-
persist_answer	boolean	Whether to persist the 'query' and the generated 'answer'	true
use_retrieval	boolean	Whether to use retrieval for the generation	true
retrieval_query	string	(optional) Query used for retrieval instead of 'query'	-
generation_model	string	Model used for generation. Available options: `gpt-4`, `gpt-4-0613`, `gpt-4-turbo-preview`, `gpt-4-0125-preview`, `gpt-3.5-turbo-16k` (see OpenAI models)	-

`Message` Type

The messages submitted to the Custom Chat API represent the prompting. Message objects have the following structure:

Parameter	Type	Description
role	string	available roles are `system`, `user`, `query`, `assistant` and `context`
content	string	The content of the message

The message roles system, user and assistant correspond to the message types of the OpenAI chat interface. The roles query and context are extensions introduced by kapa.

system: There can be only one system message per prompt. It is used to set the behavior of the assistant at the start of the conversation.
user: User messages represent the input from the user to the AI. You should write your instructions as user messages.
assistant: Assistant messages represent responses generated by the AI.
query: There can be only one query message. The query message is used for retrieval if no retrieval_query is given, and it is persisted along with the answer if persist_answer is true. It is treated as a user message when sent to GPT-4.
context: There can be only one context message. The context message is a placeholder that marks where the retrieved context is inserted.
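
The role rules above can be checked on the client before a request is sent. The following is a minimal sketch in Python; the function name `validate_messages` and the wording of the error strings are hypothetical, and kapa may enforce additional rules that are not listed here.

```python
from collections import Counter

# Roles listed in the Message type table.
ALLOWED_ROLES = {"system", "user", "query", "assistant", "context"}
# Roles that may appear at most once per prompt.
SINGLETON_ROLES = ("system", "query", "context")


def validate_messages(messages):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    counts = Counter(message.get("role") for message in messages)
    for role in counts:
        if role not in ALLOWED_ROLES:
            problems.append(f"unknown role: {role!r}")
    for role in SINGLETON_ROLES:
        if counts[role] > 1:
            problems.append(
                f"only one {role} message is allowed, found {counts[role]}"
            )
    return problems
```

Calling `validate_messages` on a prompt with two query messages returns one problem, while a prompt with a single system, context and query message returns an empty list.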

Example Request Body

{
  "persist_answer": false,
  "use_retrieval": true,
  "retrieval_query": "What are the most recent blog articles for our database?",
  "messages": [
    {
      "role": "system",
      "content": "You are a smart sales person for a database"
    },
    {
      "role": "user",
      "content": "You are given a few recent blog articles. Please use them to write an outbound email template targeted at CTOs."
    },
    {
      "role": "context"
    },
    {
      "role": "user",
      "content": "Sales Template:"
    }
  ]
}
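
A request body like the one above could be sent from Python roughly as follows. This is a sketch using only the standard library. The host in `BASE_URL`, the `API_KEY` placeholder and the `Authorization` header scheme are assumptions made for illustration; check your kapa project settings for the actual host and authentication method.

```python
import json
import urllib.request

# Assumptions: replace with the host and credentials of your kapa project.
BASE_URL = "https://api.example.invalid"
API_KEY = "YOUR_API_KEY"


def build_custom_chat_request(body):
    """Build (but do not send) a POST request for the Custom Chat API route."""
    return urllib.request.Request(
        BASE_URL + "/query/v1/chat/custom",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed header scheme; the real API may expect a different one.
            "Authorization": "Bearer " + API_KEY,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request, and the JSON response can then be read with `json.load`.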

What happens when I submit a Request?

When you submit a request, kapa performs the following steps:

1. kapa performs semantic search over your knowledge sources using the content of the query message. If a retrieval_query is specified, it is used instead.
2. kapa replaces the context message with user messages containing the relevant context it found during retrieval.
3. The query message is converted into a user message.
4. All messages are sent to OpenAI.
5. If persist_answer is true, the query message is persisted for analytics along with the generated answer.
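
The message handling described above can be illustrated with a small Python sketch. It is not kapa's implementation: the function names are hypothetical, retrieval itself is left out, and emitting one user message per retrieved chunk is an assumption about how the context is formatted.

```python
def retrieval_text(messages, retrieval_query=None):
    """Pick the text used for semantic search: retrieval_query wins over query."""
    if retrieval_query:
        return retrieval_query
    for message in messages:
        if message["role"] == "query":
            return message["content"]
    return None


def assemble_messages(messages, context_chunks):
    """Expand the context placeholder and turn the query into a user message."""
    final = []
    for message in messages:
        if message["role"] == "context":
            # Assumption: each retrieved chunk becomes its own user message.
            final.extend({"role": "user", "content": c} for c in context_chunks)
        elif message["role"] == "query":
            final.append({"role": "user", "content": message["content"]})
        else:
            final.append(dict(message))
    return final
```

The list returned by `assemble_messages` contains only system, user and assistant roles, which is the form the OpenAI chat interface accepts.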

Response Body

The following response body is returned for each request.

Parameter	Type	Description
answer	string	The generated answer
thread_id	string	The ID of the created thread that the query is part of
question_answer_id	string	The ID of the created query-answer pair
messages	Message[]	The final messages sent to OpenAI after retrieval
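
On the client side, the response fields above can be read into a small typed structure. The sketch below assumes the response has already been decoded from JSON into a dictionary; the class name `CustomChatResponse` and the function `parse_response` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CustomChatResponse:
    answer: str              # The generated answer
    thread_id: str           # ID of the thread the query is part of
    question_answer_id: str  # ID of the query-answer pair
    messages: list           # Final messages sent to the model after retrieval


def parse_response(payload):
    """Map a decoded JSON response onto the documented fields."""
    return CustomChatResponse(
        answer=payload["answer"],
        thread_id=payload["thread_id"],
        question_answer_id=payload["question_answer_id"],
        messages=list(payload["messages"]),
    )
```

A missing field raises `KeyError`, which surfaces a malformed response early instead of passing `None` values further into the application.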
