Chat Inference
Simulates a chat interaction with an AI assistant, optionally incorporating search results and external context.
Endpoint
Request Body
Field | Type | Description |
---|---|---|
infer_type | string | Must be set to "chat" for chat inference. |
infer_params | object | Parameters for the chat inference. |
stream_type | string | Specifies the streaming behavior. Options: "disabled", "enabled", "per_value". Default is "enabled". |
infer_params (object)
Field | Type | Description |
---|---|---|
chat_session_id | string | Unique identifier for the chat session. Used to keep track of conversation history. |
surf_id | string | (Optional) Unique identifier for the user. Used to provide additional context from user history across the site. Fetched from the CDP. |
user_message | string | The user's input message. |
search | object | (Optional) Search parameters to provide context. |
external_context | object | (Optional) Any external context to be considered in the chat. |
specialised_model | string | (Optional) The name of a specialised model to use for inference. |
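The two tables above can be turned into a small request builder. The following is a minimal sketch, not part of the API itself: the function name `build_chat_request` and its keyword arguments are hypothetical, and the endpoint URL is deliberately left out because it is not given here. Optional fields are omitted from the body when they are not supplied.

```python
from typing import Any, Optional

# Values taken from the stream_type description above.
VALID_STREAM_TYPES = {"disabled", "enabled", "per_value"}


def build_chat_request(
    chat_session_id: str,
    user_message: str,
    *,
    surf_id: Optional[str] = None,
    search: Optional[dict] = None,
    external_context: Optional[Any] = None,
    specialised_model: Optional[str] = None,
    stream_type: str = "enabled",
) -> dict:
    """Assemble a chat inference request body (hypothetical helper)."""
    if stream_type not in VALID_STREAM_TYPES:
        raise ValueError(f"unsupported stream_type: {stream_type!r}")

    infer_params: dict = {
        "chat_session_id": chat_session_id,
        "user_message": user_message,
    }
    optional_fields = {
        "surf_id": surf_id,
        "search": search,
        "external_context": external_context,
        "specialised_model": specialised_model,
    }
    # Only include optional fields that were actually provided.
    infer_params.update(
        {name: value for name, value in optional_fields.items() if value is not None}
    )

    return {
        "infer_type": "chat",
        "infer_params": infer_params,
        "stream_type": stream_type,
    }
```

Calling `build_chat_request("session_123", "What are some applications of artificial intelligence?")` produces the same body as the basic usage example in this document.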
search (object, optional)
If provided, this object should follow the structure of the search request as described in the Search Documents API. It includes:
Field | Type | Description |
---|---|---|
query | string (optional) | The search query (typically the user_message unless specified). |
search_type | string | Type of search to perform. |
search_space | string | Defines the search space. |
search_params | object (optional) | Additional search parameters. |
additional_filters | array of objects (optional) | Additional search filters. |
page | object | Pagination information. |
Refer to the Search Documents API for more details on the search object structure. Because chat inference uses the same backend search service, any change to the Search Documents API applies here as well.
Usage Examples
Basic Chat Inference
{
"infer_type": "chat",
"infer_params": {
"chat_session_id": "session_123",
"user_message": "What are some applications of artificial intelligence?"
},
"stream_type": "enabled"
}
Chat Inference with Global Site Search for Context
{
"infer_type": "chat",
"infer_params": {
"chat_session_id": "session_456",
"user_message": "What are some applications of artificial intelligence?",
"search": {
"search_space": "global"
}
},
"stream_type": "enabled"
}
Chat Inference with Specific Document for Context
{
"infer_type": "chat",
"infer_params": {
"chat_session_id": "session_456",
"user_message": "What are some applications of artificial intelligence?",
"search": {
"search_params": {
"doc_id": "doc_12345"
}
}
},
"stream_type": "enabled"
}
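The two search variants shown so far (global site search and a single document) differ only in the search object. As a rough sketch, they could be produced by small helpers like the ones below. The helper names `global_search`, `document_search`, and `attach_search` are hypothetical. Only the fields that appear in the examples are set, and no other search fields are assumed.

```python
import copy


def global_search() -> dict:
    """Search object for global site search, as in the example above."""
    return {"search_space": "global"}


def document_search(doc_id: str) -> dict:
    """Search object restricted to one document, as in the example above."""
    return {"search_params": {"doc_id": doc_id}}


def attach_search(request_body: dict, search: dict) -> dict:
    """Return a copy of request_body with the search object attached.

    The original request body is left unmodified.
    """
    updated = copy.deepcopy(request_body)
    updated.setdefault("infer_params", {})["search"] = search
    return updated
```

Returning a copy keeps a base request reusable across several calls with different search scopes.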
Chat Inference with External Context
{
"infer_type": "chat",
"infer_params": {
"chat_session_id": "session_789",
"user_message": "Can you summarize the key points?",
"external_context": ["The discussion is about renewable energy sources and their impact on reducing carbon emissions."]
},
"stream_type": "enabled"
}
Chat Inference with User History, Specialised Model, and Specific Document
{
"infer_type": "chat",
"infer_params": {
"chat_session_id": "session_789",
"user_message": "Can you summarize the key points?",
"surf_id": "user_123",
"specialised_model": "user_history_digest_v1",
"search": {
"search_params": {
"doc_id": "doc_12345"
}
}
},
"stream_type": "enabled"
}
Response
The response is returned to the client according to the stream_type specified in the request.
Stream Types
Stream Type | Description | Use Case |
---|---|---|
disabled | Returns the entire response at once | When the client can process the entire response in one go. |
enabled | Streams the response in chunks | When the response is too large to be processed at once. |
per_value | Streams each value separately | When the client needs to process each value individually, for example for smart chips. |
Response Structure
Field | Type | Description |
---|---|---|
key | string | The key for the streamed data (e.g., "assistant_response"). |
value | string | The content of the response. |
stream_status | string | Indicates the status of the stream. Can be "stream_running" or "stream_over". |
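A client typically needs to reassemble the streamed pieces into complete values. The sketch below shows one way to do that. It makes several assumptions: the wire format is not described here, so each chunk is taken to be already decoded into a dict with the three fields above; successive values for the same key are assumed to be fragments that can be concatenated in order; and `collect_stream` is a hypothetical name.

```python
from typing import Dict, Iterable, Mapping


def collect_stream(chunks: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Accumulate streamed values per key until the stream is over.

    Assumes each chunk is a decoded mapping with "key", "value" and
    "stream_status", and that values for one key concatenate in order.
    """
    collected: Dict[str, str] = {}
    for chunk in chunks:
        key = chunk.get("key")
        value = chunk.get("value")
        if key is not None and value is not None:
            collected[key] = collected.get(key, "") + value
        # Stop reading once the server signals the end of the stream.
        if chunk.get("stream_status") == "stream_over":
            break
    return collected
```

With `stream_type` set to "disabled", the same function can be applied to a single-element list, which keeps the client code path uniform.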
Response - Key Examples
Key | Type | Description |
---|---|---|
assistant_response | string | The response from the AI assistant. |
reffered_context | object | The context referred to in the response, along with relevancy scores. |
Any key other than assistant_response is streamed at the end of the response.
Sample reffered_context object:
{
"reffered_context": [
{
"doc_id": "doc_12345",
"chunk": "chunk_text",
"relevancy_score": 0.8
},
{
"doc_id": "doc_67890",
"chunk": "chunk_text",
"relevancy_score": 0.0
}
]
}
A relevancy_score of 0.0 indicates that the context was fetched but not used in the response. This is helpful for debugging and for understanding which context the response relied on.
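For debugging, it can help to separate the context entries that contributed to the answer from those that were only fetched. The following is a minimal sketch under the rule stated above (a score of 0.0 means unused). The function name `split_context` is hypothetical, and the input is assumed to be the already-parsed list under the reffered_context key.

```python
from typing import Dict, List, Tuple


def split_context(reffered_context: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split context entries into (used, unused) by relevancy_score.

    Entries with a missing score are treated as unused (an assumption).
    """
    used = [c for c in reffered_context if c.get("relevancy_score", 0.0) > 0.0]
    unused = [c for c in reffered_context if c.get("relevancy_score", 0.0) <= 0.0]
    return used, unused
```

Applied to the sample object above, `doc_12345` falls into the used list and `doc_67890` into the unused list.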
Error Responses
400 Bad Request: Invalid input data. Examples:
- Missing required parameters
- Invalid infer_type
- Invalid stream_type
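The error cases listed above can be caught on the client before a request is sent. The sketch below mirrors those three cases only; the server's actual validation rules may be stricter. The name `validate_chat_request` is hypothetical, and treating `chat_session_id` and `user_message` as the required parameters is an assumption based on them not being marked optional in the request tables.

```python
from typing import List

VALID_STREAM_TYPES = ("disabled", "enabled", "per_value")
# Assumption: fields not marked "(Optional)" in infer_params are required.
REQUIRED_INFER_PARAMS = ("chat_session_id", "user_message")


def validate_chat_request(body: dict) -> List[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems: List[str] = []

    if body.get("infer_type") != "chat":
        problems.append('infer_type must be "chat"')

    # stream_type defaults to "enabled" when omitted.
    stream_type = body.get("stream_type", "enabled")
    if stream_type not in VALID_STREAM_TYPES:
        problems.append(f"invalid stream_type: {stream_type!r}")

    infer_params = body.get("infer_params")
    if not isinstance(infer_params, dict):
        problems.append("infer_params is missing or not an object")
    else:
        for name in REQUIRED_INFER_PARAMS:
            if not infer_params.get(name):
                problems.append(f"infer_params.{name} is required")

    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.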
Note: The chat inference API simulates an AI assistant's response. In a production environment, it would integrate with a more sophisticated language model and knowledge base.