Chat Inference

Simulates a chat interaction with an AI assistant, optionally incorporating search results and external context.

Endpoint

POST /infer

Request Body

| Field | Type | Description |
| --- | --- | --- |
| infer_type | string | Must be set to "chat" for chat inference. |
| infer_params | object | Parameters for the chat inference. |
| stream_type | string | Specifies the streaming behavior. Options: "disabled", "enabled", "per_value". Default is "enabled". |

infer_params object

| Field | Type | Description |
| --- | --- | --- |
| chat_session_id | string | Unique identifier for the chat session. Helps keep track of conversation history. |
| surf_id | string | (Optional) Unique identifier for the user. Used to provide additional context from user history across the site. Fetched from the CDP. |
| user_message | string | The user's input message. |
| search | object | (Optional) Search parameters to provide context. |
| external_context | object | (Optional) Any external context to be considered in the chat. |
| specialised_model | string | (Optional) The name of a specialised model to use for inference. |

search object (Optional)

If provided, this object should follow the structure of the search request as described in the Search Documents API. It includes:

| Field | Type | Description |
| --- | --- | --- |
| query | string | (Optional) The search query (typically the user_message unless specified). |
| search_type | string | Type of search to perform. |
| search_space | string | Defines the search space. |
| search_params | object | (Optional) Additional search parameters. |
| additional_filters | array of objects | (Optional) Additional search filters. |
| page | object | Pagination information. |

Refer to the Search Documents API for more details on the search object structure. Any updates to the Search Documents API are reflected here as well, since the backend uses the same search service.
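
For reference, the request body can be expressed as a rough TypeScript shape. This is only a sketch inferred from the field tables above, not a published schema; the search field is left loosely typed because its structure comes from the Search Documents API.

```typescript
// Rough sketch of the request body, inferred from the field tables above.
// Optionality and exact types are assumptions.
interface ChatInferParams {
  chat_session_id: string;            // identifier for the chat session
  user_message: string;               // the user's input message
  surf_id?: string;                   // optional user identifier, fetched from the CDP
  search?: Record<string, unknown>;   // optional; follows the Search Documents API request shape
  external_context?: unknown;         // optional; the examples below pass an array of strings
  specialised_model?: string;         // optional specialised model name
}

interface InferRequest {
  infer_type: "chat";
  infer_params: ChatInferParams;
  stream_type?: "disabled" | "enabled" | "per_value"; // defaults to "enabled"
}
```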

Usage Examples

Basic Chat Inference

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_123",
    "user_message": "What are some applications of artificial intelligence?"
  },
  "stream_type": "enabled"
}
A basic chat inference request with no additional context.

Chat Inference with Global Site Search for Context

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_456",
    "user_message": "What are some applications of artificial intelligence?",
    "search": {
      "search_space": "global",
    }
  },
  "stream_type": "enabled"
}
A chat that picks up context from a global site search.

Chat Inference with Specific Document for Context

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_456",
    "user_message": "What are some applications of artificial intelligence?",
    "search": {
      "search_params": {
        "doc_id": "doc_12345"
      }
    }
  },
  "stream_type": "enabled"
}
A chat that picks up context from a specific document. Used when the user is asking a follow-up question about a document.

Chat Inference with External Context

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_789",
    "user_message": "Can you summarize the key points?",
    "external_context": ["The discussion is about renewable energy sources and their impact on reducing carbon emissions."]
  },
  "stream_type": "enabled"
}
A chat that incorporates external context into the conversation to provide more relevant responses.

Chat Inference with User History, Specialised Model, and Specific Page

{
  "infer_type": "chat",
  "infer_params": {
    "chat_session_id": "session_789",
    "user_message": "Can you summarize the key points?",
    "surf_id": "user_123",
    "specialised_model": "user_history_digest_v1",
    "search": {
      "search_params": {
        "doc_id": "doc_12345"
      }
    }
  },
  "stream_type": "enabled"
}
A chat that uses a specialised model for ingesting user history and incorporates context from a specific document. Helpful when a managed CDP is attached to the service and you want to use the user's history to provide more relevant responses.

Response

The response is streamed back to the client based on the stream_type specified in the request.

Stream Types

| Stream Type | Description | Use Case |
| --- | --- | --- |
| disabled | Returns the entire response at once | When the client can process the entire response in one go. |
| enabled | Streams the response in chunks | When the response is too large to be processed at once. |
| per_value | Streams each value separately | When the client needs to process each value individually, e.g., for smart chips. |

Response Structure

| Field | Type | Description |
| --- | --- | --- |
| key | string | The key for the streamed data (e.g., "assistant_response"). |
| value | string | The content of the response. |
| stream_status | string | Indicates the status of the stream. Can be "stream_running" or "stream_over". |
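
In TypeScript terms, each streamed chunk roughly matches the shape below. This is a sketch based on the table above; the value type varies by key, as described in the key examples that follow.

```typescript
// Sketch of a single streamed chunk, based on the Response Structure table.
interface StreamChunk {
  key: string;          // e.g. "assistant_response" or "reffered_context"
  value: unknown;       // a string for assistant_response; an object for reffered_context
  stream_status: "stream_running" | "stream_over";
}
```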

Response - Key Examples

| Key | Type | Description |
| --- | --- | --- |
| assistant_response | string | The response from the AI assistant. |
| reffered_context | object | The context referred to in the response, along with relevancy scores. |

Any key other than assistant_response is streamed at the end of the response.

Sample reffered_context object:

{
  "reffered_context": [
    {
      "doc_id": "doc_12345",
      "chunk": "chunk_text",
      "relevancy_score": 0.8
    },
    {
      "doc_id": "doc_67890",
      "chunk": "chunk_text",
      "relevancy_score": 0.0
    }
  ]
}
A relevancy_score of 0.0 indicates that the context was fetched but not used in the response. This is helpful for debugging and understanding the context used in the response.
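
As a minimal illustration of consuming the stream, the sketch below sends a basic chat request and accumulates the assistant_response chunks. The base URL and the one-JSON-chunk-per-line framing are assumptions made for illustration, not something this page specifies; adapt both to your deployment and to the actual wire format.

```typescript
// Minimal client sketch. The host and the newline-delimited JSON framing are assumptions.
async function chatInfer(userMessage: string): Promise<string> {
  const res = await fetch("https://example-host/infer", {   // hypothetical base URL
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      infer_type: "chat",
      infer_params: { chat_session_id: "session_123", user_message: userMessage },
      stream_type: "enabled",
    }),
  });
  if (!res.ok) {
    // e.g. 400 Bad Request for invalid input data
    throw new Error(`Chat inference failed with status ${res.status}`);
  }

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffered = "";
  let answer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffered += decoder.decode(value, { stream: true });

    const lines = buffered.split("\n");
    buffered = lines.pop() ?? ""; // keep any partial line for the next read
    for (const line of lines) {
      if (!line.trim()) continue;
      const chunk = JSON.parse(line); // { key, value, stream_status }
      if (chunk.key === "assistant_response") answer += chunk.value;
      if (chunk.stream_status === "stream_over") return answer;
    }
  }
  return answer;
}
```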

Error Responses

  • 400 Bad Request: Invalid input data
    • Examples:
      • Missing required parameters
      • Invalid infer_type
      • Invalid stream_type

Note: The chat inference API simulates an AI assistant's response. In a production environment, it would integrate with a more sophisticated language model and knowledge base.