Cortex Analyst REST API¶
Use this API to answer questions about your data with natural language queries.
Send message¶
POST /api/v2/cortex/analyst/message
Generates a SQL query for the given question using a semantic model provided in the request. One or more models can be specified; when multiple models are specified, Cortex Analyst chooses the most appropriate one. You can have multi-turn conversations where you can ask follow-up questions that build upon previous queries. For more information, see Multi-turn conversation in Cortex Analyst.
The request includes a user question; the response includes the user question and the analyst response. Each message in a response can have multiple content blocks of different types. The three values currently supported for the `type` field of the content object are `text`, `suggestions`, and `sql`.
Responses can be sent all at once after processing is complete, or incrementally as they are generated.
Request Headers¶
Header | Description
---|---
`Authorization` | (Required) Authorization token. For more information, see Authenticating to the server.
`X-Snowflake-Authorization-Token-Type` | Authorization token type. Defaults to OAuth; required if using any other type of token. For more information, see Authenticating to the server.
`Content-Type` | (Required) application/json
Request Body¶
The request body contains the role of the speaker (which must be `user`), the user's question, and a path to the semantic model YAML file. The user question is contained in a `content` object, which has two fields, `type` and `text`. `text` is also the only allowed value of the `type` field.
Field | Description
---|---
`messages[].role` | (Required) The role of the entity that is creating the message. Currently, only `user` is supported. Type: string:enum. Example: `user`
`messages[].content[]` | (Required) The content object that is part of a message. Type: object
`messages[].content[].type` | (Required) The content type. Currently, only `text` is supported. Type: string:enum. Example: `text`
`messages[].content[].text` | (Required) The user's question. Type: string. Example: `which company had the most revenue?`
`semantic_model_file` | Path to the semantic model YAML file. Must be a fully qualified stage URL including the database and schema. You can instead provide the complete semantic model YAML in the `semantic_model` field. Type: string. Example: `@my_db.my_schema.my_stage/my_semantic_model.yaml`
`semantic_model` | A string containing the entire semantic model YAML. You can instead pass the semantic model YAML as a staged file by providing its URL in the `semantic_model_file` field. Type: string
`stream` | (Optional) If set to `true`, the response is returned as a stream of server-sent events as it is generated, rather than all at once. See Streaming response. Type: boolean
Important
You must provide either `semantic_model_file` or `semantic_model` in the request body.
Example¶
{
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "which company had the most revenue?"
}
]
}
],
"semantic_model_file": "@my_stage/my_semantic_model.yaml"
}
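As a rough illustration, the following Python sketch sends the example request above using the `requests` library. It is not part of the API itself; the account URL and token values are placeholders that you would replace with your own.

# Minimal sketch: send the example message request with the `requests` library.
# ACCOUNT_URL and TOKEN are placeholders, not values defined by the API.
import requests

ACCOUNT_URL = "https://<account_identifier>.snowflakecomputing.com"
TOKEN = "<authorization_token>"  # e.g., an OAuth token

response = requests.post(
    f"{ACCOUNT_URL}/api/v2/cortex/analyst/message",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "which company had the most revenue?"}
                ],
            }
        ],
        "semantic_model_file": "@my_stage/my_semantic_model.yaml",
    },
)
response.raise_for_status()
response_body = response.json()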
Non-streaming response¶
This operation can return the response codes listed below.
The response always has the following structure. Currently, three content types are supported for the response: `text`, `suggestions`, and `sql`. The content types `suggestions` and `sql` are mutually exclusive: if the response contains a `sql` content type, it won't contain a `suggestions` content type, and vice versa. The `suggestions` content type is only included in a response if the user question was ambiguous and Cortex Analyst could not return a SQL statement for that query.
To ensure forward compatibility, make sure your implementation takes the content type into account and handles any types it does not recognize.
Code | Description
---|---
200 | The statement was executed successfully. The body of the response contains a message object; see the example below for its fields.
By default, the response is returned all at once after Cortex Analyst has fully processed the user’s question. See Streaming response for the format of streaming mode responses.
{ "request_id": "75d343ee-699c-483f-83a1-e314609fb563", "message": { "role": "analyst", "content": [ { "type": "text", "text": "We interpreted your question as ..." }, { "type": "sql", "statement": "SELECT * FROM table", "confidence": { "verified_query_used": { "name": "My verified query", "question": "What was the total revenue?", "sql": "SELECT * FROM table2", "verified_at": 1714497970, "verified_by": "Jane Doe" } } } ] }, "warnings": [ { "message": "Table table1 has (30) columns, which exceeds the recommended maximum of 10" }, { "message": "Table table2 has (40) columns, which exceeds the recommended maximum of 10" } ] }
Streaming response¶
Streaming mode lets your client receive responses as they are generated by Cortex Analyst, rather than waiting for the entire response to be generated. This improves the perceived responsiveness of your application, especially for long-running queries, because users begin seeing output much sooner. Streaming responses also provide status information that can help you understand where Cortex Analyst is in the process of generating a response, and warnings that can help you understand what went wrong when Cortex Analyst doesn't work as you expected.
To receive a streaming response, set the `stream` field in the request body to `true`.
Streaming responses use server-sent events.
Cortex Analyst sends five distinct types of events in a streaming response:
- `status`: Conveys status updates about the SQL generation process.
- `message.content.delta`: Contains a piece of the response. This event is sent multiple times.
- `error`: Indicates that Cortex Analyst has encountered an error and cannot continue processing the request. No further `message.content.delta` events will be sent.
- `warnings`: Sent at the end of a response to convey any warnings encountered during processing. Warnings do not stop processing.
- `done`: Sent to indicate that processing is complete and no further `message.content.delta` events will be sent.
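Before looking at the individual event types in more detail, the sketch below shows one way to read this event stream in Python with the `requests` library, assuming each event's `data` payload is JSON. The account URL, token, and request body are the same placeholders used in the earlier sketch.

# Sketch: consume the server-sent event stream. ACCOUNT_URL and TOKEN are
# placeholders; the request body matches the earlier example with "stream": true.
import json
import requests

body = {
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "which company had the most revenue?"}]}
    ],
    "semantic_model_file": "@my_stage/my_semantic_model.yaml",
    "stream": True,
}

with requests.post(
    f"{ACCOUNT_URL}/api/v2/cortex/analyst/message",
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    json=body,
    stream=True,
) as resp:
    resp.raise_for_status()
    event_name, data_lines = None, []
    for line in resp.iter_lines(decode_unicode=True):
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line terminates one SSE event; parse its data payload.
            if event_name and data_lines:
                print(event_name, json.loads("\n".join(data_lines)))
            event_name, data_lines = None, []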
Of these event types, the `message.content.delta` events are the most crucial to understand, because they contain the actual response content. Each delta contains tokens from some field in the complete response. A single `delta` event can contain anywhere from a single character to the full response, and deltas can be of different lengths. You receive these tokens as they are generated; it is up to you to assemble them into the final response.
Important
Events from different responses (even extremely similar ones) can vary. There is no guarantee that events will be sent in the same order or with the same content.
Simple example¶
The following is a sample non-streaming response for a simple query:
{
"message": {
"role": "analyst",
"content": [
{
"type": "text",
"text": "This is how we interpreted your question and this is how the sql is generated"
},
{
"type": "sql",
"statement": "SELECT * FROM table"
}
]
}
}
And this is one possible series of streaming events for that response (a different series of events is also possible):
event: status
data: { status: "interpreting_question" }
event: message.content.delta
data: {
index: 0,
type: "text",
text_delta: "This is how we interpreted your question"
}
event: status
data: { status: "generating_sql" }
event: status
data: { status: "validating_sql" }
event: message.content.delta
data: {
index: 0,
type: "text",
text_delta: " and this is how the sql is generated"
}
event: message.content.delta
data: {
index: 1,
type: "sql",
statement_delta: "SELECT * FROM table"
}
event: status
data: { status: "done" }
Use the `index` field in the `message.content.delta` responses to determine which field in the full response the event is part of. For example, here the first two `delta` events use index 0, which means they are part of the first field (element 0) in the `content` array of the non-streaming response. Similarly, the `delta` event that contains the SQL response uses index 1.
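One way to reassemble these deltas, sketched below using the field names shown in the events above, is to keep a list of content blocks and append each delta's partial text to the block at its `index`.

# Sketch: accumulate text and sql message.content.delta events into the
# content array of the full response, keyed by the `index` field.
content = []

def apply_delta(delta):
    while len(content) <= delta["index"]:
        content.append({})          # create a block the first time an index appears
    block = content[delta["index"]]
    block["type"] = delta["type"]
    if delta["type"] == "text":
        block["text"] = block.get("text", "") + delta["text_delta"]
    elif delta["type"] == "sql":
        block["statement"] = block.get("statement", "") + delta["statement_delta"]

# Applying the deltas from the example above rebuilds the non-streaming content:
apply_delta({"index": 0, "type": "text", "text_delta": "This is how we interpreted your question"})
apply_delta({"index": 0, "type": "text", "text_delta": " and this is how the sql is generated"})
apply_delta({"index": 1, "type": "sql", "statement_delta": "SELECT * FROM table"})
print(content)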
Example with suggestions¶
This example contains suggested questions for an ambiguous question. The following is the non-streaming response:
{
"message": {
"role": "analyst",
"content": [
{
"type": "text",
"text": "Your question is ambigous, here are some alternatives:"
},
{
"type": "suggestions",
"suggestions": [
"which company had the most revenue?",
"which company placed the most orders?"
]
}
]
}
}
And here is a possible series of streaming events that constitute that response:
event: status
data: { status: "interpreting_question" }
event: message.content.delta
data: {
index: 0,
type: "text",
text_delta: "Your question is ambigous,"
}
event: status
data: { status: "generating_suggestions" }
event: message.content.delta
data: {
index: 0,
type: "text",
text_delta: " here are some alternatives:"
}
event: message.content.delta
data: {
index: 1,
type: "suggestions",
suggestions_delta: {
index: 0,
suggestion_delta: "which company had",
}
}
event: message.content.delta
data: {
index: 1,
type: "suggestions",
suggestions_delta: {
index: 0,
suggestion_delta: " the most revenue?",
}
}
event: message.content.delta
data: {
index: 1,
type: "suggestions",
suggestions_delta: {
index: 1,
suggestion_delta: "which company placed",
}
}
event: message.content.delta
data: {
index: 1,
type: "suggestions",
suggestions_delta: {
index: 1,
suggestion_delta: " the most orders?",
}
}
event: status
data: { status: "done" }
In this example, the `content` field of the non-streaming response is an array, and one of its elements contains the `suggestions` array. The `index` fields in the `text` and `suggestions` delta events therefore refer to the locations of elements in these two different arrays, and you need to keep track of these indexes separately when assembling the full response.
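A minimal sketch of that bookkeeping, using the field names from the events above: the outer `index` selects the content block, and the inner `index` inside `suggestions_delta` selects which suggestion string to extend.

# Sketch: accumulate suggestions-type deltas, tracking the outer content index
# and the inner suggestion index separately.
def apply_suggestions_delta(content, delta):
    while len(content) <= delta["index"]:
        content.append({})
    block = content[delta["index"]]
    block["type"] = "suggestions"
    suggestions = block.setdefault("suggestions", [])
    inner = delta["suggestions_delta"]
    while len(suggestions) <= inner["index"]:
        suggestions.append("")
    suggestions[inner["index"]] += inner["suggestion_delta"]

content = []
apply_suggestions_delta(content, {"index": 1, "type": "suggestions",
                                  "suggestions_delta": {"index": 0, "suggestion_delta": "which company had"}})
apply_suggestions_delta(content, {"index": 1, "type": "suggestions",
                                  "suggestions_delta": {"index": 0, "suggestion_delta": " the most revenue?"}})
print(content[1]["suggestions"])  # ['which company had the most revenue?']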
Note
Currently, the generated SQL statement is always sent in a single event. This may not be the case in the future. Your client must be prepared to receive the SQL statement in multiple events.
Other examples¶
You can find a Streamlit streaming client for Cortex Analyst in the Cortex Analyst GitHub repo: https://github.com/Snowflake-Labs/sfguide-getting-started-with-cortex-analyst/blob/main/cortex_analyst_streaming_demo.py
This demo must be run locally; Streamlit in Snowflake (SiS) does not currently support streaming.
See the Cortex Analyst playground in the AI/ML Studio (in Snowsight) for an interactive demonstration of streaming responses.
Streaming event schemas¶
The following are the OpenAPI/Swagger schemas of the events sent by Cortex Analyst in a streaming response.
- status
- message.content.delta
- error
  StreamingError:
    type: object
    properties:
      message:
        type: string
        description: A description of the error
      code:
        type: string
        description: The Snowflake error code categorizing the error
      request_id:
        type: string
        description: Unique request ID
- warnings
  Warnings:
    type: object
    description: Warnings found while processing the request
    properties:
      warnings:
        type: array
        items:
          $ref: "#/components/schemas/Warning"

  Warning:
    type: object
    title: The warning object
    description: Represents a warning within a chat.
    properties:
      message:
        type: string
        description: A human-readable message describing the warning
Send feedback¶
POST /api/v2/cortex/analyst/feedback
Provides qualitative end-user feedback. Within Snowsight, the feedback is visible to semantic model admins under Monitoring.
Request Headers¶
Header | Description
---|---
`Authorization` | (Required) Authorization token. For more information, see Authenticating to the server.
`Content-Type` | (Required) application/json
Request Body¶
Field | Description
---|---
`request_id` | (Required) The ID of the request that you've made to send a message. Returned in the `request_id` field of the send message response. Type: string. Example: `75d343ee-699c-483f-83a1-e314609fb563`
`positive` | (Required) Whether the feedback is positive or negative. Type: boolean. Example: `true`
`feedback_message` | (Optional) The feedback message from the user.
Response¶
Empty response body with status code 200.
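As a rough sketch, the call below submits feedback for an earlier message request using the `requests` library; the account URL and token are the same placeholders as in the earlier sketches, and the field names follow the request body table above.

# Sketch: send feedback about a previous send message request.
# ACCOUNT_URL and TOKEN are placeholders; request_id comes from an earlier response.
import requests

feedback = requests.post(
    f"{ACCOUNT_URL}/api/v2/cortex/analyst/feedback",
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    json={
        "request_id": "75d343ee-699c-483f-83a1-e314609fb563",
        "positive": False,
        "feedback_message": "The SQL query used the wrong table.",  # optional free-text feedback
    },
)
assert feedback.status_code == 200  # the endpoint returns an empty body on success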
Access control requirements¶
For information on the required privileges, see Access control requirements.
For details about authenticating to the API, see Authenticating Snowflake REST APIs with Snowflake.