Cortex Analyst¶
Overview¶
Cortex Analyst is a fully managed, LLM-powered Snowflake Cortex feature that helps you create applications capable of reliably answering business questions based on your structured data in Snowflake. With Cortex Analyst, business users can ask questions in natural language and receive direct answers without writing SQL. Available as a convenient REST API, Cortex Analyst can be seamlessly integrated into any application.
Building a production-grade conversational self-service analytics solution requires a service that generates accurate text-to-SQL responses. For most teams, developing such a service that successfully balances accuracy, latency, and costs is a daunting task. Cortex Analyst simplifies this process by providing a fully managed, sophisticated agentic AI system that handles all of these complexities, generating highly accurate text-to-SQL responses. It helps you accelerate the delivery of high-precision, self-serve conversational analytics to business teams, while avoiding time sinks such as complex RAG solution patterns, model experimentation, and GPU capacity planning. The generated SQL queries are executed against the scalable Snowflake engine, ensuring industry-leading price performance and lower total cost of ownership (TCO).
Tip
Want to get started with Cortex Analyst quickly? Try the tutorial: Answer questions about time-series revenue data with Cortex Analyst.
Key Features¶
Self-serve analytics via natural language queries. Delight your business teams and non-technical users with instant answers and insights from their structured data in Snowflake. Using Cortex Analyst, you can build downstream chat applications that allow your users to ask questions using natural language and receive accurate answers on the fly.
Convenient REST API for integration into existing business workflows. Cortex Analyst takes an API-first approach, giving you full control over the end user experience. Easily integrate Cortex Analyst into existing business tools and platforms, bringing the power of data insights to where business users already operate, such as Streamlit apps, Slack, Teams, custom chat interfaces, and more.
Powered by state-of-the-art large language models: By default, Cortex Analyst is powered by the latest Meta Llama and Mistral models, which run securely inside Snowflake Cortex, Snowflake’s intelligent, fully managed AI service. Optionally, you can also give Cortex Analyst access to the latest Azure-hosted OpenAI GPT models. At runtime, Cortex Analyst selects the best combination of models to ensure the highest accuracy and performance for each query. For details, see Enabling use of Azure OpenAI models. As LLMs evolve, Snowflake will continue to explore adding more models to the mix to further improve performance and accuracy.
Semantic model for high precision and accuracy: Generic AI solutions often struggle with text-to-SQL conversions when given only a database schema, as schemas lack critical knowledge like business process definitions and metrics handling. Cortex Analyst overcomes this limitation by using a semantic model that bridges the gap between business users and databases. Captured in a lightweight YAML file, the overall structure and concepts of the semantic model are similar to those of database schemas, but allow for a richer description of the semantic information around the data.
Security and governance. Snowflake’s privacy-first foundation and enterprise-grade security ensure that you can explore AI-driven use cases with confidence, knowing your data is protected by the highest standards of privacy and governance.
Cortex Analyst does not train on Customer Data. We do not use your Customer Data to train or fine-tune any Model to be made available for use across our customer base. Additionally, for inference, Cortex Analyst utilizes the metadata provided in the semantic model YAML file (e.g., table names, column names, value type, descriptions, etc.) only for SQL-query generation. This SQL query is then executed in your Snowflake virtual warehouse to generate the final output.
Data stays within Snowflake’s governance boundary. By default, Cortex Analyst is powered by Snowflake-hosted LLMs from Mistral and Meta, ensuring that no data, including metadata or prompts, leaves Snowflake’s governance boundary. If you opt to use Azure OpenAI models, only metadata and prompts are transmitted outside of Snowflake’s governance boundary.
Seamless integration with Snowflake’s Privacy and Governance features. Cortex Analyst fully integrates with Snowflake’s role-based access control (RBAC) policies, ensuring that SQL queries generated and executed adhere to all established access controls. This guarantees robust security and governance for your data.
Access control requirements¶
To make a request to Cortex Analyst, you must use a role that has the SNOWFLAKE.CORTEX_USER role granted.
To use Cortex Analyst with a semantic model, you also need the following privileges:
Privilege | Object |
---|---|
USAGE | Stage that contains the semantic model YAML file, if the semantic model is uploaded to a stage. |
USAGE | The Cortex Search services mentioned in the semantic model. |
SELECT | The tables mentioned in the semantic model. |
Requests to the Cortex Analyst API must include an authorization token. For details on how to authenticate to the API, see Authenticating Snowflake REST APIs with Snowflake.
Note that the example in this topic uses a session token to authenticate to a Snowflake account.
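As a rough illustration of what an authenticated request looks like, the following Python sketch builds (but does not send) a single-question request. The endpoint path `/api/v2/cortex/analyst/message`, the helper name `build_analyst_request`, and the example account URL are assumptions made for illustration; the exact `Authorization` header value depends on the authentication method you use, so it is passed in as-is rather than constructed here.

```python
import json
import urllib.request

# Assumption: path of the Cortex Analyst message endpoint; confirm it against
# the Cortex Analyst REST API reference before relying on it.
ANALYST_PATH = "/api/v2/cortex/analyst/message"


def build_analyst_request(account_url, authorization, question, semantic_model_file):
    """Build (without sending) a POST request that asks one question.

    `authorization` is the complete Authorization header value; its format
    depends on how you authenticate (session token, OAuth, key pair, ...).
    """
    body = {
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": question}]}
        ],
        "semantic_model_file": semantic_model_file,
    }
    return urllib.request.Request(
        url=account_url.rstrip("/") + ANALYST_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": authorization,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response. The role behind the token still needs the privileges listed above.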
Limiting access to specific roles¶
By default, the CORTEX_USER role is granted to the PUBLIC role. The PUBLIC role is automatically granted to all users and roles. If you don’t want all users to have this privilege, you can revoke access from the PUBLIC role and grant access to specific roles. For more information, see Required privileges.
To control access to specific semantic models, you can store the YAML file in a stage and control access to that stage.
Region Availability¶
Cortex Analyst is natively available in the following regions.
AWS ap-northeast-1 (Tokyo)
AWS ap-southeast-2 (Sydney)
AWS us-east-1 (Virginia)
AWS us-west-2 (Oregon)
AWS eu-central-1 (Frankfurt)
AWS eu-west-1 (Ireland)
Azure East US 2 (Virginia)
Azure West Europe (Netherlands)
If your Snowflake account is in a different cloud region, you can still use Cortex Analyst by enabling Cross-region inference. Once cross-region inference is enabled, Cortex Analyst processes requests in other regions for models that are not available in your default region. For optimal performance, configure cross-region inference with AWS US regions.
Known issues and limitations¶
If you upload a semantic model YAML file to a stage, access to that semantic model is controlled by access to the stage it’s uploaded to. This means that any role with access to the stage can access the semantic models on that stage, even if the role doesn’t have access to the underlying tables.
By default, Cortex Analyst is rate-limited to 20 requests per minute, which should be sufficient for proof of concept. Contact your Sales Engineer to request a higher limit.
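An application that might exceed the default limit can throttle itself on the client side. The following is a minimal sketch of a sliding-window limiter, not part of any Snowflake library; the class name `RequestThrottle` and its methods are hypothetical, and the defaults simply mirror the 20 requests per minute mentioned above.

```python
import time
from collections import deque


class RequestThrottle:
    """Client-side sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit=20, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the class can be tested without sleeping
        self._sent = deque()    # timestamps of recently recorded requests

    def wait_time(self):
        """Return the seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Discard timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return self.window - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())
```

Before each call, sleep for `wait_time()` seconds and then call `record()`. This only smooths traffic from a single process; several processes sharing one account would need a shared limiter.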
Enabling use of Azure OpenAI models¶
By default, Cortex Analyst is powered by Snowflake-hosted Cortex LLMs. You can, however, explicitly opt-in to allow Cortex Analyst to use the latest OpenAI GPT models, hosted by Microsoft Azure, alongside the Snowflake-hosted models. At runtime, Cortex Analyst selects the optimal combination of models to ensure the highest accuracy and performance for each query.
Note
If you opt in to using Azure OpenAI models, Cortex Analyst is available for use in all AWS, Azure, and GCP regions, except for Gov and VPS deployments.
You can configure your account to allow use of the Azure OpenAI GPT models with the ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI parameter. By default, the parameter is disabled and can only be set by the ACCOUNTADMIN role using the ALTER ACCOUNT command:
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI = TRUE;
Tip
To see the current value of this parameter, use the following SQL statement.
SHOW PARAMETERS LIKE 'ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI' IN ACCOUNT;
See ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI for more details.
When this parameter is enabled, Cortex Analyst might be powered by any combination of:
Snowflake-hosted models, currently Mistral Large and Llama3 models
Azure OpenAI models, currently GPT-4o (requires explicit opt-in)
Note
Cortex Analyst might use different models in the future to further improve performance and accuracy.
Considerations¶
Semantic model files are classified as metadata. If you opt in to using Azure OpenAI models in Cortex Analyst, your semantic model will be processed by Microsoft Azure, a third party. Customer Data, however, is not shared with or processed by Azure.
ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI¶
The ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI account parameter, if TRUE, allows Cortex Analyst to use Azure OpenAI models.
Parameter Type | Session |
---|---|
Data Type | BOOLEAN |
Description | Controls whether Cortex Analyst can use Azure OpenAI models to process requests. |
Values | |
Default | FALSE |
Multi-turn conversation in Cortex Analyst¶
Cortex Analyst supports multi-turn conversations for data-related questions. This feature enables asking follow-up questions that build on previous queries, creating a more dynamic and interactive data exploration experience. For example, the user asks, “What is the month-over-month revenue growth for 2021 in Asia?”, then follows up with, “What about North America?”
Cortex Analyst recognizes the follow-up, retrieves the context from the initial query, and rephrases the second question as: “What is the month-over-month revenue growth for 2021 in North America?” Cortex Analyst then generates a SQL query to answer this question.
To use this feature, pass the conversation history in the messages field:
{
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is the month over month revenue growth for 2021 in Asia?"
                }
            ]
        },
        {
            "role": "analyst",
            "content": [
                {
                    "type": "text",
                    "text": "We interpreted your question as ..."
                },
                {
                    "type": "sql",
                    "statement": "SELECT * FROM table"
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What about North America?"
                }
            ]
        }
    ],
    "semantic_model_file": "@my_stage/my_semantic_model.yaml"
}
The conversation history is an array of messages in chronological order, where each message has a role and content. The role can be "user" (for previous questions) or "analyst" (for previous responses). Analyst responses have both text and SQL responses, as shown in the example above, while user messages have only text.
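The message structure described above can be assembled programmatically. The following Python sketch uses hypothetical helper names (`user_message`, `analyst_message`, `build_request_body`) to produce a request body with the same shape as the JSON example; it only builds the payload and makes no API call.

```python
def user_message(text):
    """A user turn: a single text content item."""
    return {"role": "user", "content": [{"type": "text", "text": text}]}


def analyst_message(text, sql):
    """An analyst turn: the interpretation text plus the generated SQL statement."""
    return {
        "role": "analyst",
        "content": [
            {"type": "text", "text": text},
            {"type": "sql", "statement": sql},
        ],
    }


def build_request_body(history, question, semantic_model_file):
    """Return a request body: prior turns in order, then the new question.

    `history` is left unmodified so the caller decides when to extend it.
    """
    return {
        "messages": list(history) + [user_message(question)],
        "semantic_model_file": semantic_model_file,
    }


history = [
    user_message("What is the month over month revenue growth for 2021 in Asia?"),
    analyst_message("We interpreted your question as ...", "SELECT * FROM table"),
]
body = build_request_body(
    history, "What about North America?", "@my_stage/my_semantic_model.yaml"
)
```

After each response, append both the question and the analyst's reply to `history` so that the next request carries the full conversation.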
Important
Like all LLM-powered features, Cortex Analyst does not save state between requests. Instead, it re-processes the entire conversation history at each turn. Therefore, all messages in the request are billable, including historical messages that have already been processed in a previous request.
Known limitations in multi-turn conversations¶
Some of the following limitations might be addressed in future versions of Cortex Analyst.
- Access to the results of previous SQL queries
Cortex Analyst doesn’t have access to results from previous SQL queries. For example, if you first ask, “What are my products?” and then ask, “What is the revenue of the second product?”, Cortex Analyst cannot refer to the list of products from the first query to get the second product.
- General business insights
Cortex Analyst is limited to answering questions that can be resolved with SQL. It does not generate insights for broader business-related queries, such as “What trends do you observe?”
- Long conversations
If a conversation includes too many turns or the user shifts intent frequently, Cortex Analyst might struggle to interpret the follow-up questions. In such cases, reset the conversation and start again.
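One way to act on the advice about long conversations is to let the client reset the history once it passes a chosen length. The sketch below is illustrative only: the `Conversation` class is hypothetical, and the threshold of 10 user turns is an arbitrary assumption, not a documented Cortex Analyst limit.

```python
class Conversation:
    """Keeps the message history and starts over once it grows too long."""

    def __init__(self, max_user_turns=10):
        # Assumption: 10 is an arbitrary illustrative threshold.
        self.max_user_turns = max_user_turns
        self.messages = []

    def ask(self, question):
        """Add a user question and return the messages to send in the request."""
        user_turns = sum(1 for m in self.messages if m["role"] == "user")
        if user_turns >= self.max_user_turns:
            self.reset()
        self.messages.append(
            {"role": "user", "content": [{"type": "text", "text": question}]}
        )
        return list(self.messages)  # copy, so callers cannot mutate the history

    def add_response(self, text, sql):
        """Store the analyst's reply so later questions can build on it."""
        self.messages.append(
            {
                "role": "analyst",
                "content": [
                    {"type": "text", "text": text},
                    {"type": "sql", "statement": sql},
                ],
            }
        )

    def reset(self):
        """Drop all history and begin a fresh conversation."""
        self.messages = []
```

Because the entire history is re-processed on every request, a shorter history also means fewer messages sent per turn. An application might additionally expose `reset()` to users who want to change topic.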
Getting started options¶
Developers can use the following resources to get started with Cortex Analyst:
Basic code example: The Cortex Analyst example in the following section provides a simple, easy-to-read script that helps you create an interactive app using Cortex Analyst.
Choose this option if you want a basic example to start with, and are comfortable with using Streamlit and making your own modifications. You can run this example either in Streamlit in Snowflake (SiS) or locally.
Snowflake Samples repository: If you’re instead looking for a more comprehensive implementation, the Cortex Analyst advanced SiS demo in the Snowflake Samples repository has all the features and options already set up. This repository is configured with various pre-built features that make deploying Cortex Analyst seamless and robust.
Choose this option if you are trying to test out the feature for the first time, or have fewer custom modifications to make.
Note
This is shown only as an example. Snowflake does not provide support for the below content, nor does Snowflake warrant that the below content is accurate.
To learn more, see the Cortex Analyst advanced SiS demo in the Snowflake Samples GitHub repository.
Cortex Analyst example¶
Follow these steps to create an interactive Streamlit in Snowflake (SiS) or standalone Streamlit app that uses Cortex Analyst.
Create and run a Streamlit in Snowflake app
Create a semantic model¶
A semantic model is a lightweight mechanism that addresses issues related to the language difference between business users and database definitions by allowing for the specification of additional semantic details about a dataset. These additional semantic details, like more descriptive names or synonyms, enable Cortex Analyst to answer data questions much more reliably.
Start with a list of questions you would like Cortex Analyst to answer. Based on that, decide on the dataset for your semantic model.
Create your semantic model YAML based on the specification. For convenience, try the Semantic model generator. Also, be sure to review the tips for creating a semantic model.
Upload semantic model¶
You can upload a semantic model YAML file to a stage or pass the semantic model YAML as a string in the request body. If you upload a semantic model YAML to a stage, access to that semantic model is controlled by access to the stage it’s uploaded to. This means that any role with access to the stage can access the semantic models on that stage even if the role doesn’t have access to the tables that the models are based on. Ensure that roles granted access to a stage have SELECT access on all tables referenced in all semantic models on that stage.
Below is an example of how to set up the stages containing the semantic models. One stage (public) is accessible to all members of the organization, whereas another stage (sales) is only accessible to the sales_analyst role.
Create the database and schema for the stage. The following example creates a database named semantic_model with a schema named definitions, but you can use any valid identifier string for these names.
CREATE DATABASE semantic_model;
CREATE SCHEMA semantic_model.definitions;
GRANT USAGE ON DATABASE semantic_model TO ROLE PUBLIC;
GRANT USAGE ON SCHEMA semantic_model.definitions TO ROLE PUBLIC;
USE SCHEMA semantic_model.definitions;
Then create the stages for storing your semantic models:
CREATE STAGE public DIRECTORY = (ENABLE = TRUE);
GRANT READ ON STAGE public TO ROLE PUBLIC;
CREATE STAGE sales DIRECTORY = (ENABLE = TRUE);
GRANT READ ON STAGE sales TO ROLE sales_analyst;
If using Snowsight, you can refresh the page and find the newly created stages in the database object explorer. You can open the stage page in a new tab and upload your YAML files in Snowsight.
Alternatively, you can use the Snowflake CLI client to upload from your local file system.
snow stage copy file:///path/to/local/file.yaml @sales
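As noted at the start of this section, a request can reference either a staged YAML file or the YAML passed as a string. The following Python sketch shows one way to select the matching request field. The helper name `semantic_model_fields` is hypothetical, and the inline field name `semantic_model` is an assumption to verify against the Cortex Analyst REST API reference; `semantic_model_file` is the field used in the multi-turn example earlier in this topic.

```python
def semantic_model_fields(stage_path=None, yaml_text=None):
    """Return the request-body field that identifies the semantic model.

    Exactly one of `stage_path` (for example "@sales/file.yaml") or
    `yaml_text` (the YAML document as a string) must be provided.
    """
    if (stage_path is None) == (yaml_text is None):
        raise ValueError("Provide exactly one of stage_path or yaml_text")
    if stage_path is not None:
        return {"semantic_model_file": stage_path}
    # Assumption: "semantic_model" is the field name for an inline YAML string.
    return {"semantic_model": yaml_text}


# Merge the chosen field into a request body alongside the messages.
body = {
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "What questions can I ask?"}]}
    ],
    **semantic_model_fields(stage_path="@sales/file.yaml"),
}
```

With a staged file, access is governed by the stage grants shown above. Passing the YAML inline avoids the stage entirely, so the application itself decides who may use which model.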
Creating a Streamlit in Snowflake App¶
This example shows you how to create a Streamlit in Snowflake app that takes a natural language question as input and calls Cortex Analyst to generate an answer based on the semantic model you provide.
Note
This is shown only as an example. Snowflake does not provide support for the below content, nor does Snowflake warrant that the below content is accurate.
For more information on creating and running Streamlit apps in Snowflake, see About Streamlit in Snowflake.
Follow the directions in Create a Streamlit app by using Snowsight to create a new Streamlit app in Snowsight.
Copy the Streamlit code from our GitHub repo into the code editor.
Replace the placeholder values with your account details.
To preview the app, select Run to update the content in the Streamlit preview pane.
Interact with the Streamlit App¶
Navigate to the Streamlit app in your browser or the Streamlit in Snowflake preview pane.
Start asking questions about your data in natural language (e.g. “What questions can I ask?”).
Create a standalone Streamlit app¶
You can also use the example code to build a standalone app.
Note
This is shown only as an example. Snowflake does not provide support for the below content, nor does Snowflake warrant that the below content is accurate.
Install Streamlit.
Create a Python file locally called analyst_api.py.
Copy the Streamlit code from our GitHub repo into the file.
Replace the placeholder values with your account details.
Run the Streamlit app using streamlit run analyst_api.py.
The database and schema specified in the code are the stage location for the semantic model YAML file. The role used in the Snowflake connector should have access to the underlying data defined in the semantic model.
For a more comprehensive implementation, see the Cortex Analyst advanced SiS demo in the Snowflake Samples GitHub repository. This repository is configured with various pre-built features that make deploying Cortex Analyst seamless and robust.
Disable Cortex Analyst Functionality¶
If you do not want Cortex Analyst to be available in your account, disable the feature by changing the ENABLE_CORTEX_ANALYST parameter using the ACCOUNTADMIN role:
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET ENABLE_CORTEX_ANALYST = FALSE;
Parameter Type | Session |
---|---|
Data Type | BOOLEAN |
Description | Controls whether Cortex Analyst functionality is enabled in your account. |
Values | |
Default | TRUE |
Cost considerations¶
Credit usage for Cortex Analyst is based on the number of messages processed, at the rate listed in the Snowflake Service Consumption Table. A single message represents a request and response pair. Note that only successful responses (HTTP 200) are counted.
Note
The above charges cover AI costs for text-to-SQL. Additional warehouse costs apply if you execute the SQL generated by Cortex Analyst.
Legal notices¶
Cortex Analyst is powered by machine learning technology, including Meta’s Llama 3 and Mistral Large models. The foundation Llama 3 models are licensed under the Llama 3 Community License and Copyright (c) Meta Platforms, Inc. All Rights Reserved. Your use of this feature is subject to Meta’s Acceptable Use Policy.
The data classifications of inputs and outputs are as set forth in the following table.
Input data classification | Output data classification | Designation |
---|---|---|
Usage Data | Usage Data | Preview AI Features [1] |
For additional information, refer to Snowflake AI and ML.