Custom Chat API
In addition to the Conversation API, kapa offers an API that gives you control over the prompting used to generate text. You can use it to build applications that fall outside the standard kapa behavior. The Custom Chat API draws on the same data sources as the Conversation API, but gives you full control over the prompting and limited control over retrieval.
API Route
POST /query/v1/chat/custom
Request Body
The following request body parameters can be submitted:
Parameter | Type | Description | Default |
---|---|---|---|
messages | Message[] | List of chat messages | - |
persist_answer | boolean | Whether to persist the 'query' and the generated 'answer' | true |
use_retrieval | boolean | Whether to use retrieval for the generation | true |
retrieval_query | string | (optional) Query used for retrieval instead of 'query' | - |
generation_model | string | Model used for generation. Available options are gpt-4, gpt-4-0613, gpt-4-turbo-preview, gpt-4-0125-preview and gpt-3.5-turbo-16k (see OpenAI models) | - |
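As a rough illustration of the parameter table, the sketch below assembles a request body in Python. The helper name `build_request_body` is hypothetical. The client-side model check and the choice to leave out unset optional fields are assumptions made for this example, not documented kapa behavior.

```python
from typing import Any, Optional

# Model names copied from the parameter table above.
ALLOWED_MODELS = {
    "gpt-4",
    "gpt-4-0613",
    "gpt-4-turbo-preview",
    "gpt-4-0125-preview",
    "gpt-3.5-turbo-16k",
}


def build_request_body(
    messages: list[dict[str, str]],
    persist_answer: bool = True,   # documented default
    use_retrieval: bool = True,    # documented default
    retrieval_query: Optional[str] = None,
    generation_model: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body (hypothetical helper, not part of any kapa SDK)."""
    # Assumption: reject unknown model names locally before sending anything.
    if generation_model is not None and generation_model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported generation_model: {generation_model!r}")
    body: dict[str, Any] = {
        "messages": messages,
        "persist_answer": persist_answer,
        "use_retrieval": use_retrieval,
    }
    # Assumption: optional fields are omitted when they are not set.
    if retrieval_query is not None:
        body["retrieval_query"] = retrieval_query
    if generation_model is not None:
        body["generation_model"] = generation_model
    return body
```

Calling `build_request_body(messages)` with no other arguments yields a body that carries only the messages and the two documented defaults.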
Message Type
The messages submitted to the Custom Chat API represent the prompting. Message objects have the following structure:
Parameter | Type | Description |
---|---|---|
role | string | Available roles are system, user, query, assistant and context |
content | string | The content of the message |
The message roles system, user and assistant correspond to the message types of the OpenAI chat interface. The roles query and context are extensions introduced by the kapa system.
- system: There can be only one system message per prompt. It sets the behavior of the assistant at the start of the conversation.
- user: User messages represent the input from the user to the AI. You should write your instructions as user messages.
- assistant: Assistant messages represent responses generated by the AI.
- query: There can be only one query message. It is used for retrieval if no retrieval_query is given, and it is persisted along with the answer if persist_answer is true. It is treated as a user message when sent to the generation model.
- context: There can be only one context message. It is a placeholder that marks where the retrieval context is inserted.
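The role rules above can be checked locally before a request is sent. The following is a minimal sketch. The function name `validate_messages` is hypothetical, and the API may apply further checks that are not described here.

```python
from collections import Counter

# Roles listed in the Message table.
VALID_ROLES = {"system", "user", "assistant", "query", "context"}
# Roles that may appear at most once per prompt.
SINGLETON_ROLES = ("system", "query", "context")


def validate_messages(messages):
    """Return a list of human-readable problems; an empty list means no rule was broken."""
    errors = []
    counts = Counter(message.get("role") for message in messages)
    for role in counts:
        if role not in VALID_ROLES:
            errors.append(f"unknown role: {role!r}")
    for role in SINGLETON_ROLES:
        if counts[role] > 1:
            errors.append(f"only one {role} message is allowed, found {counts[role]}")
    return errors
```

The check only looks at roles. It does not require a content field, because the context message is a placeholder.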
Example Request Body
{
  "persist_answer": false,
  "use_retrieval": true,
  "retrieval_query": "What are the most recent blog articles for our database?",
  "messages": [
    {
      "role": "system",
      "content": "You are a smart sales person for a database"
    },
    {
      "role": "user",
      "content": "You are given a few recent blog articles. Please use them to write an outbound email template targeted at CTOs."
    },
    {
      "role": "context"
    },
    {
      "role": "user",
      "content": "Sales Template:"
    }
  ]
}
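A request like this could be prepared from Python roughly as follows, using only the standard library. This is a sketch under stated assumptions: the base URL is a placeholder, authentication is not covered in this section and must be supplied through `extra_headers` according to your own setup, and the helper name `make_custom_chat_request` is hypothetical.

```python
import json
import urllib.request

ROUTE = "/query/v1/chat/custom"  # route from the "API Route" section


def make_custom_chat_request(base_url, body, extra_headers=None):
    """Build (but do not send) a POST request for the Custom Chat API."""
    headers = {"Content-Type": "application/json"}
    # Assumption: authentication headers, if any, are passed in by the caller.
    headers.update(extra_headers or {})
    return urllib.request.Request(
        url=base_url.rstrip("/") + ROUTE,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


body = {
    "persist_answer": False,
    "use_retrieval": True,
    "retrieval_query": "What are the most recent blog articles for our database?",
    "messages": [
        {"role": "system", "content": "You are a smart sales person for a database"},
        {"role": "context"},
        {"role": "user", "content": "Sales Template:"},
    ],
}

# Placeholder host; replace it with the base URL of your kapa deployment.
request = make_custom_chat_request("https://kapa.example.invalid", body)

# Sending is left commented out because the host above does not exist:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```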
What happens when I submit a Request?
When a user submits a request, kapa performs the following steps:
- kapa performs semantic search over your knowledge sources using the content of the query message. If a retrieval_query is specified, it is used instead.
- kapa replaces the context message with user messages containing the relevant context it found during retrieval.
- The query message is converted into a user message.
- All messages are sent to OpenAI.
- If persist_answer is true, the query message is persisted for analytics along with the generated answer.
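To make the message handling in these steps concrete, here is a local simulation in Python. It is not kapa's implementation. The function names are hypothetical, the retrieved chunks are passed in by the caller, and representing each chunk as one user message is an assumption.

```python
def pick_retrieval_query(messages, retrieval_query=None):
    """Return the text used for retrieval: retrieval_query if given, else the query message."""
    if retrieval_query is not None:
        return retrieval_query
    for message in messages:
        if message["role"] == "query":
            return message["content"]
    return None


def expand_messages(messages, retrieved_chunks):
    """Mimic the documented transformation of the prompt before it goes to the model."""
    final = []
    for message in messages:
        role = message["role"]
        if role == "context":
            # The placeholder is replaced by user messages holding the retrieved context.
            final.extend({"role": "user", "content": chunk} for chunk in retrieved_chunks)
        elif role == "query":
            # The query message becomes an ordinary user message.
            final.append({"role": "user", "content": message["content"]})
        else:
            final.append(dict(message))
    return final
```

After this transformation only system, user and assistant messages remain, which matches the message types of the OpenAI chat interface.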
Response Body
The following response body is returned for each request.
Parameter | Type | Description |
---|---|---|
answer | string | The generated answer |
thread_id | string | The id of the created thread which the query is part of |
question_answer_id | string | The id of the created query answer pair |
messages | Message[] | The final messages sent to OpenAI after retrieval |
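A client might map this response body onto a small typed structure, as in the sketch below. The class and function names are hypothetical, and the code assumes that all four documented fields are present in every response.

```python
from dataclasses import dataclass


@dataclass
class CustomChatResponse:
    answer: str
    thread_id: str
    question_answer_id: str
    messages: list


def parse_response(payload: dict) -> CustomChatResponse:
    """Convert a decoded JSON response body into a CustomChatResponse."""
    # Assumption: every documented field is present; a missing key raises KeyError.
    return CustomChatResponse(
        answer=payload["answer"],
        thread_id=payload["thread_id"],
        question_answer_id=payload["question_answer_id"],
        messages=list(payload["messages"]),
    )
```

The returned `messages` list is useful for debugging, since it shows the prompt after the context and query messages were expanded.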