Cortex Analyst

Overview

Cortex Analyst is a fully-managed, LLM-powered Snowflake Cortex feature that helps you create applications capable of reliably answering business questions based on your structured data in Snowflake. With Cortex Analyst, business users can ask questions in natural language and receive direct answers without writing SQL. Available as a convenient REST API, Cortex Analyst can be seamlessly integrated into any application.

Building a production-grade conversational self-service analytics solution requires a service that generates accurate text-to-SQL responses. For most teams, developing such a service that successfully balances accuracy, latency, and costs is a daunting task. Cortex Analyst simplifies this process by providing a fully managed, sophisticated agentic AI system that handles all of these complexities, generating highly accurate text-to-SQL responses. It helps you accelerate the delivery of high-precision, self-serve conversational analytics to business teams, while avoiding time sinks such as complex RAG solution patterns, model experimentation, and GPU capacity planning. The generated SQL queries are executed against the scalable Snowflake engine, ensuring industry-leading price performance and lower total cost of ownership (TCO).

Tip

Want to get started with Cortex Analyst quickly? Try the Tutorial: Answer questions about time-series revenue data with Cortex Analyst tutorial.

Key Features

  • Self-serve analytics via natural language queries. Delight your business teams and non-technical users with instant answers and insights from their structured data in Snowflake. Using Cortex Analyst, you can build downstream chat applications that allow your users to ask questions using natural language and receive accurate answers on the fly.

  • Convenient REST API for integration into existing business workflows. Cortex Analyst takes an API-first approach, giving you full control over the end user experience. Easily integrate Cortex Analyst into existing business tools and platforms, bringing the power of data insights to where business users already operate, such as Streamlit apps, Slack, Teams, custom chat interfaces, and more.

  • Powered by state-of-the-art large language models: By default, Cortex Analyst is powered by the latest Meta Llama and Mistral models, which run securely inside Snowflake Cortex, Snowflake’s intelligent, fully managed AI service. Optionally, you can also give Cortex Analyst access to the latest Azure-hosted OpenAI GPT models. At runtime, Cortex Analyst selects the best combination of models to ensure the highest accuracy and performance for each query. For details, see Enabling use of Azure OpenAI models. As LLMs evolve, Snowflake will continue to explore adding more models to the mix to further improve performance and accuracy.

  • Semantic model for high precision and accuracy: Generic AI solutions often struggle with text-to-SQL conversions when given only a database schema, as schemas lack critical knowledge like business process definitions and metrics handling. Cortex Analyst overcomes this limitation by using a semantic model that bridges the gap between business users and databases. Captured in a lightweight YAML file, the overall structure and concepts of the semantic model are similar to those of database schemas, but allow for a richer description of the semantic information around the data.

  • Security and governance. Snowflake’s privacy-first foundation and enterprise-grade security ensure that you can explore AI-driven use cases with confidence, knowing your data is protected by the highest standards of privacy and governance.

    • Cortex Analyst does not train on Customer Data. We do not use your Customer Data to train or fine-tune any Model to be made available for use across our customer base. Additionally, for inference, Cortex Analyst utilizes the metadata provided in the semantic model YAML file (e.g., table names, column names, value type, descriptions, etc.) only for SQL-query generation. This SQL query is then executed in your Snowflake virtual warehouse to generate the final output.

    • Data stays within Snowflake’s governance boundary. By default, Cortex Analyst is powered by Snowflake-hosted LLMs from Mistral and Meta, ensuring that no data, including metadata or prompts, leaves Snowflake’s governance boundary. If you opt to use Azure OpenAI models, only metadata and prompts are transmitted outside of Snowflake’s governance boundary.

    • Seamless integration with Snowflake’s Privacy and Governance features. Cortex Analyst fully integrates with Snowflake’s role-based access control (RBAC) policies, ensuring that SQL queries generated and executed adhere to all established access controls. This guarantees robust security and governance for your data.

Access control requirements

To use Cortex Analyst with a semantic model, you need the following privileges:

Privilege

Object

USAGE

Stage that contains the semantic model YAML file, if the semantic model is uploaded to a stage.

SELECT

The tables defined in the semantic model.

Requests to the Cortex Analyst API must include authentication information. The example in this topic uses a session token to authenticate to a Snowflake account.

Region Availability

Cortex Analyst is natively available in the following regions.

  • AWS ap-northeast-1 (Tokyo)

  • AWS ap-southeast-2 (Sydney)

  • AWS us-east-1 (Virginia)

  • AWS us-west-2 (Oregon)

  • AWS eu-central-1 (Frankfurt)

  • AWS eu-west-1 (Ireland)

  • Azure East US 2 (Virginia)

  • Azure West Europe (Netherlands)

If your Snowflake account is in a different cloud region, you can still use Cortex Analyst by leveraging Cross-region inference. Once cross-region inference is enabled, Cortex Analyst processes requests in other regions for models that are not available in your default region. For optimal performance, configure cross-region with AWS US regions.

Note

If you opt in to using Azure OpenAI models, Cortex Analyst is available for use in all AWS, Azure, and GCP regions, except for Gov and VPS deployments.

Known issues and limitations

  • If you upload a semantic model YAML file to a stage, access to that semantic model is controlled by access to the stage it’s uploaded to. This means that any role with access to the stage can access the semantic models on that stage, even if the role doesn’t have access to the underlying tables.

  • Joins aren’t directly supported in the semantic model. Instead, use views that contain pre-joined data.

  • By default, Cortex Analyst is rate-limited to 20 requests per minute, which should be sufficient for proof of concept. Contact your Sales Engineer to request a higher limit.

Enabling use of Azure OpenAI models

By default, Cortex Analyst is powered by Snowflake-hosted Cortex LLMs. You can, however, explicitly opt-in to allow Cortex Analyst to use the latest OpenAI GPT models, hosted by Microsoft Azure, alongside the Snowflake-hosted models. At runtime, Cortex Analyst selects the optimal combination of models to ensure the highest accuracy and performance for each query.

You can configure your account to allow use of the Azure OpenAI GPT models with the ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI parameter. By default, the parameter is disabled and can only be set by the ACCOUNTADMIN role using the ALTER ACCOUNT command:

USE ROLE ACCOUNTADMIN;

ALTER ACCOUNT SET ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI = TRUE;
Copy

Tip

To see the current value of this parameter, use the following SQL statement.

SHOW PARAMETERS LIKE 'ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI' IN ACCOUNT
Copy

See ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI for more details.

When this parameter is enabled, Cortex Analyst might be powered by any combination of:

  • Snowflake-hosted models, currently Mistral Large and Llama3 models

  • Azure OpenAI models, currently GPT-4o (requires explicit opt-in)

Note

Cortex Analyst might use different models in the future to further improve performance and accuracy.

Considerations

Semantic model files are classified as metadata. If you opt in to using Azure OpenAI models in Cortex Analyst, your semantic model will be processed by Microsoft Azure, a third party. Customer Data, however, is not shared with or processed by Azure.

ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI

The ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI account parameter, if TRUE, allows Cortex Analyst to use Azure OpenAI models.

Parameter Type

Session

Data Type

BOOLEAN

Description

Controls whether Cortex Analyst can use Azure OpenAI models to process requests.

Values

  • FALSE: Cortex Analyst uses only Snowflake-hosted models to process requests.

  • TRUE: Cortex Analyst can use Azure OpenAI models, in addition to Snowflake-hosted models, to process requests.

Default

FALSE

Cortex Analyst example

Follow these steps to create an interactive Streamlit in Snowflake (SiS) or standalone Streamlit app that uses Cortex Analyst.

  1. Create a semantic model

  2. Upload the semantic model to stage

  3. Create and run a Streamlit in Snowflake app

  4. Interact with the Streamlit in Snowflake app

Create a semantic model

A semantic model is a lightweight mechanism that addresses issues related to the language difference between business users and database definitions by allowing for the specification of additional semantic details about a dataset. These additional semantic details, like more descriptive names or synonyms, enable Cortex Analyst to answer data questions much more reliably.

  1. Start with a list of questions you would like Cortex Analyst to answer. Based on that, decide on the dataset for your semantic model.

  2. Create your semantic model YAML based on the specification. For convenience, try the Semantic model generator. Also, be sure to review the tips for creating a semantic model.

Upload semantic model

You can upload a semantic model YAML file to a stage or pass the semantic model YAML as a string in the request body. If you upload a semantic model YAML to a stage, access to that semantic model is controlled by access to the stage it’s uploaded to. This means that any role with access to the stage can access the semantic models on that stage even if the role doesn’t have access to the tables that the models are based on. Ensure that roles granted access to a stage have SELECT access on all tables referenced in all semantic models on that stage.

Below is an example of how to set up the stages containing the semantic models. One stage (public) is accessible to all members of the organization, whereas another stage (sales) is only accessible to the sales_analyst role.

Create the database and schema for the stage. The following example creates a database named semantic_model with a schema named definition but you can use any valid identifier string for these names.

CREATE DATABASE semantic_model;
CREATE SCHEMA semantic_model.definitions;
GRANT USAGE ON DATABASE semantic_model TO ROLE PUBLIC;
GRANT USAGE ON SCHEMA semantic_model.definitions TO ROLE PUBLIC;

USE SCHEMA semantic_model.definitions;
Copy

Then create the stages for storing your semantic models:

CREATE STAGE public DIRECTORY = (ENABLE = TRUE);
GRANT READ ON STAGE public TO ROLE PUBLIC;

CREATE STAGE sales DIRECTORY = (ENABLE = TRUE);
GRANT READ ON STAGE sales TO ROLE sales_analyst;
Copy

If using Snowsight, you can refresh the page and find the newly created stages in the database object explorer. You can open the stage page in a new tab and upload your YAML files in Snowsight.

Alternatively, you can use the Snowflake CLI client to upload from your local file system.

snow stage copy file:///path/to/local/file.yaml @sales
Copy

Creating a Streamlit in Snowflake App

This example shows you how to create a Streamlit in Snowflake app that takes a natural language question as input and calls Cortex Analyst to generate an answer based on the semantic model you provide.

For more information on creating and running Streamlit apps in Snowflake, see About Streamlit in Snowflake.

  1. Follow the directions in Create a Streamlit app by using Snowsight to create a new Streamlit app in Snowsight.

  2. Copy the Streamlit code below into the code editor.

  3. Replace the placeholder values with your account details.

  4. To preview the app, select Run to update the content in the Streamlit preview pane.

from typing import Dict, List, Optional

import _snowflake
import json
import streamlit as st
import time
from snowflake.snowpark.context import get_active_session

DATABASE = "SNOWFLAKE_SEMANTIC_CONTEXT"
SCHEMA = "DEFINITIONS"
STAGE = "PUBLIC"
FILE = "revenue_timeseries.yaml"

def send_message(prompt: str) -> dict:
    """Calls the REST API and returns the response."""
    request_body = {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": prompt
                    }
                ]
            }
        ],
        "semantic_model_file": f"@{DATABASE}.{SCHEMA}.{STAGE}/{FILE}",
    }

    resp = _snowflake.send_snow_api_request(
        "POST",
        f"/api/v2/cortex/analyst/message",
        {},
        {},
        request_body,
        {},
        30000,
    )

    if resp["status"] < 400:
        return json.loads(resp["content"])
    else:
        st.session_state.messages.pop()
        raise Exception(
            f"Failed request with status {resp['status']}: {resp}"
        )

def process_message(prompt: str) -> None:
    """Processes a message and adds the response to the chat."""
    st.session_state.messages.append(
        {"role": "user", "content": [{"type": "text", "text": prompt}]}
    )
    with st.chat_message("user"):
        st.markdown(prompt)
    with st.chat_message("assistant"):
        with st.spinner("Generating response..."):
            response = send_message(prompt=prompt)
            request_id = response["request_id"]
            content = response["message"]["content"]
            st.session_state.messages.append(
                {**response['message'], "request_id": request_id}
            )
            display_content(content=content, request_id=request_id)  # type: ignore[arg-type]


def display_content(
    content: List[Dict[str, str]],
    request_id: Optional[str] = None,
    message_index: Optional[int] = None,
) -> None:
    """Displays a content item for a message."""
    message_index = message_index or len(st.session_state.messages)
    if request_id:
        with st.expander("Request ID", expanded=False):
            st.markdown(request_id)
    for item in content:
        if item["type"] == "text":
            st.markdown(item["text"])
        elif item["type"] == "suggestions":
            with st.expander("Suggestions", expanded=True):
                for suggestion_index, suggestion in enumerate(item["suggestions"]):
                    if st.button(suggestion, key=f"{message_index}_{suggestion_index}"):
                        st.session_state.active_suggestion = suggestion
        elif item["type"] == "sql":
            display_sql(item["statement"])


@st.cache_data
def display_sql(sql: str) -> None:
    with st.expander("SQL Query", expanded=False):
        st.code(sql, language="sql")
    with st.expander("Results", expanded=True):
        with st.spinner("Running SQL..."):
            session = get_active_session()
            df = session.sql(sql).to_pandas()
            if len(df.index) > 1:
                data_tab, line_tab, bar_tab = st.tabs(
                    ["Data", "Line Chart", "Bar Chart"]
                )
                data_tab.dataframe(df)
                if len(df.columns) > 1:
                    df = df.set_index(df.columns[0])
                with line_tab:
                    st.line_chart(df)
                with bar_tab:
                    st.bar_chart(df)
            else:
                st.dataframe(df)


def show_conversation_history() -> None:
    for message_index, message in enumerate(st.session_state.messages):
        chat_role = "assistant" if message["role"] == "analyst" else "user"
        with st.chat_message(chat_role):
            display_content(
                content=message["content"],
                request_id=message.get("request_id"),
                message_index=message_index,
            )


def reset() -> None:
    st.session_state.messages = []
    st.session_state.suggestions = []
    st.session_state.active_suggestion = None


st.title("Cortex analyst")
st.markdown(f"Semantic Model: `{FILE}`")

if "messages" not in st.session_state:
    reset()

with st.sidebar:
    if st.button("Reset conversation"):
        reset()

show_conversation_history()

if user_input := st.chat_input("What is your question?"):
    process_message(prompt=user_input)

if st.session_state.active_suggestion:
    process_message(prompt=st.session_state.active_suggestion)
    st.session_state.active_suggestion = None
Copy

Interact with the Streamlit App

  1. Navigate to the Streamlit app in your browser or the Streamlit in Snowflake preview pane.

  2. Start asking questions about your data in natural language (e.g. “What questions can I ask?”).

Create a standalone Streamlit app

You can also use the example code above to build a standalone app.

  1. Install Streamlit.

  2. Create a Python file locally called analyst_api.py.

  3. Copy the Streamlit code below into the file.

  4. Replace the placeholder values with your account details.

  5. Run the Streamlit app using streamlit run analyst_api.py.

The database and schema specified in the following code is the stage location for the semantic model YAML file. The role used in the Snowflake connector should have access to underlying data defined in semantic model.

from typing import Any, Dict, List, Optional

import pandas as pd
import requests
import snowflake.connector
import streamlit as st


HOST = "<host>"
DATABASE = "<database>"
SCHEMA = "<schema>"
STAGE = "<stage>"
FILE = "<file>"

if 'CONN' not in st.session_state or st.session_state.CONN is None:
    st.session_state.CONN = snowflake.connector.connect(
        user="<user>",
        password="<password>",
        account="<account>",
        host=HOST,
        port=443,
        warehouse="<warehouse>",
        role="<role>",
    )


def send_message(prompt: str) -> Dict[str, Any]:
    """Calls the REST API and returns the response."""
    request_body = {
        "messages": [{"role": "user", "content": [{"type": "text", "text": prompt}]}],
        "semantic_model_file": f"@{DATABASE}.{SCHEMA}.{STAGE}/{FILE}",
    }
    resp = requests.post(
        url=f"https://{HOST}/api/v2/cortex/analyst/message",
        json=request_body,
        headers={
            "Authorization": f'Snowflake Token="{st.session_state.CONN.rest.token}"',
            "Content-Type": "application/json",
        },
    )
    request_id = resp.headers.get("X-Snowflake-Request-Id")
    if resp.status_code < 400:
        return {**resp.json(), "request_id": request_id}  # type: ignore[arg-type]
    else:
        st.session_state.messages.pop()
        raise Exception(
            f"Failed request (id: {request_id}) with status {resp.status_code}: {resp.text}"
        )


def process_message(prompt: str) -> None:
    """Processes a message and adds the response to the chat."""
    st.session_state.messages.append(
        {"role": "user", "content": [{"type": "text", "text": prompt}]}
    )
    with st.chat_message("user"):
        st.markdown(prompt)
    with st.chat_message("assistant"):
        with st.spinner("Generating response..."):
            response = send_message(prompt=prompt)
            request_id = response["request_id"]
            content = response["message"]["content"]
            st.session_state.messages.append(
                {**response['message'], "request_id": request_id}
            )
            display_content(content=content, request_id=request_id)  # type: ignore[arg-type]


def display_content(
    content: List[Dict[str, str]],
    request_id: Optional[str] = None,
    message_index: Optional[int] = None,
) -> None:
    """Displays a content item for a message."""
    message_index = message_index or len(st.session_state.messages)
    if request_id:
        with st.expander("Request ID", expanded=False):
            st.markdown(request_id)
    for item in content:
        if item["type"] == "text":
            st.markdown(item["text"])
        elif item["type"] == "suggestions":
            with st.expander("Suggestions", expanded=True):
                for suggestion_index, suggestion in enumerate(item["suggestions"]):
                    if st.button(suggestion, key=f"{message_index}_{suggestion_index}"):
                        st.session_state.active_suggestion = suggestion
        elif item["type"] == "sql":
            display_sql(item["statement"])


@st.cache_data
def display_sql(sql: str) -> None:
    with st.expander("SQL Query", expanded=False):
        st.code(sql, language="sql")
    with st.expander("Results", expanded=True):
        with st.spinner("Running SQL..."):
            df = pd.read_sql(sql, st.session_state.CONN)
            if len(df.index) > 1:
                data_tab, line_tab, bar_tab = st.tabs(
                    ["Data", "Line Chart", "Bar Chart"]
                )
                data_tab.dataframe(df)
                if len(df.columns) > 1:
                    df = df.set_index(df.columns[0])
                with line_tab:
                    st.line_chart(df)
                with bar_tab:
                    st.bar_chart(df)
            else:
                st.dataframe(df)


def show_conversation_history() -> None:
    for message_index, message in enumerate(st.session_state.messages):
        chat_role = "assistant" if message["role"] == "analyst" else "user"
        with st.chat_message(chat_role):
            display_content(
                content=message["content"],
                request_id=message.get("request_id"),
                message_index=message_index,
            )


def reset() -> None:
    st.session_state.messages = []
    st.session_state.suggestions = []
    st.session_state.active_suggestion = None


st.title("Cortex analyst")
st.markdown(f"Semantic Model: `{FILE}`")

if "messages" not in st.session_state:
    reset()

with st.sidebar:
    if st.button("Reset conversation"):
        reset()

show_conversation_history()

if user_input := st.chat_input("What is your question?"):
    process_message(prompt=user_input)

if st.session_state.active_suggestion:
    process_message(prompt=st.session_state.active_suggestion)
    st.session_state.active_suggestion = None
Copy

Disable Cortex Analyst Functionality

If you do not want Cortex Analyst to be available in your account, disable the feature by changing the ENABLE_CORTEX_ANALYST parameter using the ACCOUNTADMIN role:

USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET ENABLE_CORTEX_ANALYST = FALSE;
Copy

Parameter Type

Session

Data Type

BOOLEAN

Description

Controls whether Cortex Analyst functionality is enabled in your account.

Values

  • FALSE: Cortex Analyst functionality is not available.

  • TRUE: Cortex Analyst functionality is available. If ENABLE_CORTEX_ANALYST_MODEL_AZURE_OPENAI is set to TRUE, Cortex Analyst can use Azure OpenAI models as well as Snowflake-hosted models. Otherwise, only Snowflake-hosted models can be used.

Default

TRUE

Cost considerations

Cortex Analyst is free to use until November 12, 2024 (PST). Starting on November 12, usage of Cortex Analyst will incur a cost of 67 credits per 1,000 messages. Each "message" represents a request/response pair.