Chat Inference

Simulates a chat interaction with an AI assistant, optionally incorporating search results and external context.

Endpoint

POST /infer

Request Body

| Field | Type | Description |
| --- | --- | --- |
| infer_type | string | Must be set to "chat" for chat inference. |
| infer_params | object | Parameters for the chat inference. |
| stream_type | string | Specifies the streaming behavior. Options: "disabled", "enabled", "per_value". Default is "enabled". |

infer_params object

| Field | Type | Description |
| --- | --- | --- |
| chat_session_id | string | Unique identifier for the chat session. Helps keep track of conversation history. |
| surf_id | string | (Optional) Unique identifier for the user. Used to provide additional context from user history across the site. Fetched from the CDP. |
| user_message | string | The user's input message. |
| search | object | (Optional) Search parameters to provide context. |
| external_context | object | (Optional) Any external context to be considered in the chat. |
| specialised_model | string | (Optional) The name of a specialised model to use for inference. |

search object (Optional)

If provided, this object should follow the structure of the search request as described in the Search Documents API. It includes:

| Field | Type | Description |
| --- | --- | --- |
| query | string | (Optional) The search query (typically the user_message unless specified). |
| search_type | string | Type of search to perform. |
| search_space | string | Defines the search space. |
| search_params | object | (Optional) Additional search parameters. |
| additional_filters | array of objects | (Optional) Additional search filters. |
| page | object | Pagination information. |

Refer to the Search Documents API for more details on the search object structure. Any updates to the Search Documents API are reflected here as well, since the backend uses the same search service.
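
For reference, the request body can be expressed as a rough TypeScript shape. This is only a sketch inferred from the field tables above, not a published schema; the search field is left loosely typed because its structure comes from the Search Documents API.

```typescript
// Rough sketch of the request body, inferred from the field tables above.
// Optionality and exact types are assumptions.
interface ChatInferParams {
  chat_session_id: string;            // identifier for the chat session
  user_message: string;               // the user's input message
  surf_id?: string;                   // optional user identifier, fetched from the CDP
  search?: Record<string, unknown>;   // optional; follows the Search Documents API request shape
  external_context?: unknown;         // optional; the examples below pass an array of strings
  specialised_model?: string;         // optional specialised model name
}

interface InferRequest {
  infer_type: "chat";
  infer_params: ChatInferParams;
  stream_type?: "disabled" | "enabled" | "per_value"; // defaults to "enabled"
}
```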

Usage Examples

Basic Chat Inference

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_123",
    "user_message": "What are some applications of artificial intelligence?"
  },
  "stream_type": "enabled"
}
A basic chat inference request with no additional context.

Chat Inference with Global Site Search for Context

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_456",
    "user_message": "What are some applications of artificial intelligence?",
    "search": {
      "search_space": "global",
    }
  },
  "stream_type": "enabled"
}
A chat that picks up context from a global site search.

Chat Inference with Specific Document for Context

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_456",
    "user_message": "What are some applications of artificial intelligence?",
    "search": {
      "search_params": {
        "doc_id": "doc_12345"
      }
    }
  },
  "stream_type": "enabled"
}
A chat that picks up context from a specific document. Used when the user is asking a follow-up question about a document.

Chat Inference with External Context

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_789",
    "user_message": "Can you summarize the key points?",
    "external_context": ["The discussion is about renewable energy sources and their impact on reducing carbon emissions."]
  },
  "stream_type": "enabled"
}
A chat that incorporates external context into the conversation to provide more relevant responses.

Chat Inference with User History, Specialised Model, and Specific Page

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_789",
    "user_message": "Can you summarize the key points?",
    "surf_id": "user_123",
    "specialised_model": "user_history_digest_v1",
    "search": {
      "search_params": {
        "doc_id": "doc_12345"
      }
    }
  },
  "stream_type": "enabled"
}
A chat that uses a specialised model for ingesting user history and incorporates context from a specific document. Helpful when a managed CDP is attached to the service and you want to use the user's history to provide more relevant responses.

Response

The response is streamed back to the client based on the stream_type specified in the request.

Stream Types

| Stream Type | Description | Use Case |
| --- | --- | --- |
| disabled | Returns the entire response at once | When the client can process the entire response in one go. |
| enabled | Streams the response in chunks | When the response is too large to be processed at once. |
| per_value | Streams each value separately | When the client needs to process each value individually, e.g., for smart chips. |

Response Structure

| Field | Type | Description |
| --- | --- | --- |
| key | string | The key for the streamed data (e.g., "assistant_response"). |
| value | string | The content of the response. |
| stream_status | string | Indicates the status of the stream. Can be "stream_running" or "stream_over". |
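
In TypeScript terms, each streamed chunk roughly matches the shape below. This is a sketch based on the table above; the value type varies by key, as described in the key examples that follow.

```typescript
// Sketch of a single streamed chunk, based on the Response Structure table.
interface StreamChunk {
  key: string;          // e.g. "assistant_response" or "reffered_context"
  value: unknown;       // a string for assistant_response; an object for reffered_context
  stream_status: "stream_running" | "stream_over";
}
```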

Response - Key Examples

| Key | Type | Description |
| --- | --- | --- |
| assistant_response | string | The response from the AI assistant. |
| reffered_context | object | The context referred to in the response, along with relevancy scores. |

Any key other than assistant_response is streamed at the end of the response.

Sample reffered_context object:

{
  "reffered_context": [
    {
      "doc_id": "doc_12345",
      "chunk": "chunk_text",
      "relevancy_score": 0.8
    },
    {
      "doc_id": "doc_67890",
      "chunk": "chunk_text",
      "relevancy_score": 0.0
    }
  ]
}
A relevancy_score of 0.0 indicates that the context was fetched but not used in the response. This is helpful for debugging and understanding the context used in the response.
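
As a minimal illustration of consuming the stream, the sketch below sends a basic chat request and accumulates the assistant_response chunks. The base URL and the one-JSON-chunk-per-line framing are assumptions made for illustration, not something this page specifies; adapt both to your deployment and to the actual wire format.

```typescript
// Minimal client sketch. The host and the newline-delimited JSON framing are assumptions.
async function chatInfer(userMessage: string): Promise<string> {
  const res = await fetch("https://example-host/infer", {   // hypothetical base URL
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      infer_type: "chat",
      infer_params: { chat_session_id: "session_123", user_message: userMessage },
      stream_type: "enabled",
    }),
  });
  if (!res.ok) {
    // e.g. 400 Bad Request for invalid input data
    throw new Error(`Chat inference failed with status ${res.status}`);
  }

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffered = "";
  let answer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffered += decoder.decode(value, { stream: true });

    const lines = buffered.split("\n");
    buffered = lines.pop() ?? ""; // keep any partial line for the next read
    for (const line of lines) {
      if (!line.trim()) continue;
      const chunk = JSON.parse(line); // { key, value, stream_status }
      if (chunk.key === "assistant_response") answer += chunk.value;
      if (chunk.stream_status === "stream_over") return answer;
    }
  }
  return answer;
}
```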

Error Responses

  • 400 Bad Request: Invalid input data
    • Examples:
      • Missing required parameters
      • Invalid infer_type
      • Invalid stream_type

Note: The chat inference API simulates an AI assistant's response. In a production environment, it would integrate with a more sophisticated language model and knowledge base.