Snowflake Cortex AISQL (including LLM functions)¶
Use Cortex AISQL in Snowflake to run unstructured analytics on text and images with industry-leading LLMs from OpenAI, Anthropic, Meta, Mistral AI, and DeepSeek. Cortex AISQL supports use cases such as:
Extracting entities to enrich metadata and streamline validation
Aggregating insights across customer tickets
Filtering and classifying content by natural language
Sentiment and aspect-based analysis for service improvement
Translating and localizing multilingual content
Parsing documents for analytics and RAG pipelines
All models are fully hosted in Snowflake, ensuring performance, scalability, and governance while keeping your data secure and in place.
Available functions¶
Snowflake Cortex features are provided as SQL functions and are also available in Python. Cortex AISQL functions can be grouped into the following categories:
AISQL functions¶
These purpose-built managed functions automate routine tasks, such as simple summaries and quick translations, that don’t require any customization.
AI_COMPLETE: Generates a completion for a given text string or image using a selected LLM. Use this function for most generative AI tasks.
AI_COMPLETE is the updated version of COMPLETE (SNOWFLAKE.CORTEX).
AI_CLASSIFY: Classifies text or images into user-defined categories.
AI_CLASSIFY is the updated version of CLASSIFY_TEXT (SNOWFLAKE.CORTEX) with support for multi-label and image classification.
AI_FILTER: Returns True or False for a given text or image input, allowing you to filter results in SELECT, WHERE, or JOIN ... ON clauses.
AI_AGG: Aggregates a text column and returns insights across multiple rows based on a user-defined prompt. This function isn’t subject to context window limitations.
AI_EMBED: Generates an embedding vector for a text or image input, which can be used for similarity search, clustering, and classification tasks.
AI_EMBED is the updated version of EMBED_TEXT_1024 (SNOWFLAKE.CORTEX).
AI_EXTRACT: Extracts information from an input string or file, such as text, images, and documents. Supports multiple languages.
AI_EXTRACT is the updated version of EXTRACT_ANSWER (SNOWFLAKE.CORTEX).
AI_SENTIMENT: Extracts sentiment scores from text.
AI_SENTIMENT is the updated version of SENTIMENT (SNOWFLAKE.CORTEX).
AI_SUMMARIZE_AGG: Aggregates a text column and returns a summary across multiple rows. This function isn’t subject to context window limitations.
AI_SIMILARITY: Calculates the embedding similarity between two inputs.
AI_TRANSCRIBE: Transcribes audio files stored in a stage, extracting text, timestamps, and speaker information.
AI_PARSE_DOCUMENT: Extracts text (using OCR mode) or text with layout information (using LAYOUT mode) from documents in an internal or external stage.
AI_PARSE_DOCUMENT is the updated version of PARSE_DOCUMENT (SNOWFLAKE.CORTEX).
TRANSLATE (SNOWFLAKE.CORTEX): Translates text between supported languages.
SUMMARIZE (SNOWFLAKE.CORTEX): Returns a summary of the text that you’ve specified.
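As an illustration of how these functions compose with ordinary SQL, the following sketch filters and classifies rows in a single query. The table and column names (support_tickets, content) are hypothetical, and the prompt wording is illustrative:

```sql
-- Keep only tickets about billing problems, then bucket them into categories.
-- support_tickets and content are hypothetical names.
SELECT
    ticket_id,
    AI_CLASSIFY(content, ['billing', 'shipping', 'product defect']) AS category
FROM support_tickets
WHERE AI_FILTER(PROMPT('Is this ticket about a billing problem? {0}', content));
```

Because AI_FILTER appears in the WHERE clause, rows the model judges irrelevant never reach AI_CLASSIFY, which keeps token consumption down.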
Helper functions¶
Helper functions are purpose-built managed functions that reduce cases of failures when running other AISQL functions, for example by getting the count of tokens in an input prompt to ensure the call doesn’t exceed a model limit.
TO_FILE: Creates a reference to a file in an internal or external stage for use with AI_COMPLETE and other functions that accept files.
COUNT_TOKENS (SNOWFLAKE.CORTEX): Given an input text, returns the token count based on the model or Cortex function specified.
PROMPT: Helps you build prompt objects for use with AI_COMPLETE and other functions.
TRY_COMPLETE (SNOWFLAKE.CORTEX): Works like the COMPLETE function, but returns NULL instead of raising an error when the function cannot execute.
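For example, COUNT_TOKENS can screen inputs before you spend tokens on them. A minimal sketch, with a hypothetical reviews table and a 128,000-token limit assumed for the model:

```sql
-- Count the tokens each input would consume for mistral-large2,
-- and skip rows that would exceed the assumed context window.
SELECT review_text,
       SNOWFLAKE.CORTEX.COUNT_TOKENS('mistral-large2', review_text) AS n_tokens
FROM reviews
WHERE SNOWFLAKE.CORTEX.COUNT_TOKENS('mistral-large2', review_text) < 128000;
```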
Cortex Guard¶
Cortex Guard is an option of the AI_COMPLETE (or SNOWFLAKE.CORTEX.COMPLETE) function designed to filter potentially unsafe and harmful responses from a language model. Cortex Guard is currently built with Meta’s Llama Guard 3. Cortex Guard works by evaluating the responses of a language model before that output is returned to the application. Once you activate Cortex Guard, language model responses that may be associated with violent crimes, hate, sexual content, self-harm, and more are automatically filtered. See COMPLETE arguments for syntax and examples.
Note
Usage of Cortex Guard incurs compute charges based on the number of input tokens processed, in addition to the charges for the AI_COMPLETE function.
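Cortex Guard is enabled through the options object of the COMPLETE function. A minimal sketch (the prompt is illustrative; see COMPLETE arguments for the full syntax):

```sql
-- Enable Cortex Guard via the guardrails option; responses flagged as
-- unsafe are filtered before being returned.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large2',
    [{'role': 'user', 'content': 'Summarize our refund policy.'}],
    {'guardrails': TRUE}
);
```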
Performance considerations¶
Cortex AISQL functions are optimized for throughput. We recommend using these functions to process many inputs, such as text from large SQL tables; batch processing is typically the best fit for AISQL functions. For interactive use cases where latency matters, use the REST API instead, which is available for simple inference (Complete API), embedding (Embed API), and agentic applications (Agents API).
Required privileges¶
The CORTEX_USER database role in the SNOWFLAKE database includes the privileges that allow users to call Snowflake Cortex AI functions. By default, the CORTEX_USER role is granted to the PUBLIC role. The PUBLIC role is automatically granted to all users and roles, so this allows all users in your account to use the Snowflake Cortex AI functions.
If you don’t want all users to have this privilege, you can revoke access to the PUBLIC role and grant access to other roles. The SNOWFLAKE.CORTEX_USER database role cannot be granted directly to a user. For more information, see Using SNOWFLAKE database roles.
To revoke the CORTEX_USER database role from the PUBLIC role, run the following commands using the ACCOUNTADMIN role:
REVOKE DATABASE ROLE SNOWFLAKE.CORTEX_USER
FROM ROLE PUBLIC;
REVOKE IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE
FROM ROLE PUBLIC;
You can then selectively provide access to specific roles. A user with the ACCOUNTADMIN role can grant this role to a custom role in order to allow users to access Cortex AI functions. In the following example, use the ACCOUNTADMIN role and grant the user some_user the CORTEX_USER database role via the account role cortex_user_role, which you create for this purpose.
USE ROLE ACCOUNTADMIN;
CREATE ROLE cortex_user_role;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE cortex_user_role;
GRANT ROLE cortex_user_role TO USER some_user;
You can also grant access to Snowflake Cortex AI functions through existing roles commonly used by specific groups of users. (See User roles.) For example, if you have created an analyst role that is used as a default role by analysts in your organization, you can easily grant these users access to Snowflake Cortex AISQL functions with a single GRANT statement.
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE analyst;
Control model access¶
Snowflake Cortex provides two independent mechanisms to enforce model access:
Account-level allowlist parameter (simple, broad control)
Role-based access control (RBAC) (fine-grained control)
You can use the account-level allowlist to control model access across your entire account, or you can use RBAC to control model access on a per-role basis. For maximum flexibility, you can also use both mechanisms together, if you can accept additional complexity.
Account-level allowlist parameter¶
You can control model access across your entire account using the CORTEX_MODELS_ALLOWLIST parameter. Supported features will respect the value of this parameter and prevent use of models that are not in the allowlist.
The CORTEX_MODELS_ALLOWLIST parameter can be set to 'All', 'None', or a comma-separated list of model names. This parameter can only be set at the account level, not at the user or session level. Only the ACCOUNTADMIN role can set the parameter, using the ALTER ACCOUNT command.
Examples:
To allow access to all models:
ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'All';
To allow access to the mistral-large2 and llama3.1-70b models:
ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'mistral-large2,llama3.1-70b';
To prevent access to any model:
ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'None';
Use RBAC, as described in the following section, to provide specific roles with access beyond what you’ve specified in the allowlist.
Role-based access control (RBAC)¶
Although Cortex models are not themselves Snowflake objects, Snowflake lets you create model objects in the SNOWFLAKE.MODELS schema that represent the Cortex models. By applying RBAC to these objects, you can control access to models the same way you would any other Snowflake object. Supported features accept the identifiers of objects in SNOWFLAKE.MODELS wherever a model can be specified.
Tip
To use RBAC exclusively, set CORTEX_MODELS_ALLOWLIST to 'None'.
Refresh model objects and application roles¶
SNOWFLAKE.MODELS is not automatically populated with the objects that represent Cortex models. You must create these objects when you first set up model RBAC, and refresh them when you want to apply RBAC to new models.
As ACCOUNTADMIN, run the SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH stored procedure to populate the SNOWFLAKE.MODELS schema with objects representing currently available Cortex models, and to create application roles that correspond to the models. The procedure also creates CORTEX-MODEL-ROLE-ALL, a role that covers all models.
Tip
You can safely call CORTEX_BASE_MODELS_REFRESH at any time; it will not create duplicate objects or roles.
CALL SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH();
After refreshing the model objects, you can verify that the models appear in the SNOWFLAKE.MODELS schema as follows:
SHOW MODELS IN SNOWFLAKE.MODELS;
The returned list of models resembles the following:
| created_on | name | model_type | database_name | schema_name | owner |
|---|---|---|---|---|---|
| 2025-04-22 09:35:38.558 -0700 | CLAUDE-3-5-SONNET | CORTEX_BASE | SNOWFLAKE | MODELS | SNOWFLAKE |
| 2025-04-22 09:36:16.793 -0700 | LLAMA3.1-405B | CORTEX_BASE | SNOWFLAKE | MODELS | SNOWFLAKE |
| 2025-04-22 09:37:18.692 -0700 | SNOWFLAKE-ARCTIC | CORTEX_BASE | SNOWFLAKE | MODELS | SNOWFLAKE |
To verify that you can see the application roles associated with these models, use the SHOW APPLICATION ROLES command, as in the following example:
SHOW APPLICATION ROLES IN APPLICATION SNOWFLAKE;
The list of application roles resembles the following:
| created_on | name | owner | comment | owner_role_type |
|---|---|---|---|---|
| 2025-04-22 09:35:38.558 -0700 | CORTEX-MODEL-ROLE-ALL | SNOWFLAKE | MODELS | APPLICATION |
| 2025-04-22 09:36:16.793 -0700 | CORTEX-MODEL-ROLE-LLAMA3.1-405B | SNOWFLAKE | MODELS | APPLICATION |
| 2025-04-22 09:37:18.692 -0700 | CORTEX-MODEL-ROLE-SNOWFLAKE-ARCTIC | SNOWFLAKE | MODELS | APPLICATION |
Grant application roles to user roles¶
After you create the model objects and application roles, you can grant the application roles to specific user roles in your account.
To grant a role access to a specific model:
GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-LLAMA3.1-70B" TO ROLE MY_ROLE;
To grant a role access to all models (current and future models):
GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-ALL" TO ROLE MY_ROLE;
Use model objects with supported features¶
To use model objects with supported Cortex features, specify the identifier of the model object in SNOWFLAKE.MODELS as the model argument. You can use either a qualified identifier or a partial identifier, depending on your current database and schema context.
Using a fully-qualified identifier:
SELECT AI_COMPLETE('SNOWFLAKE.MODELS."LLAMA3.1-70B"', 'Hello');
Using a partial identifier:
USE DATABASE SNOWFLAKE;
USE SCHEMA MODELS;
SELECT AI_COMPLETE('LLAMA3.1-70B', 'Hello');
Using RBAC with account-level allowlist¶
A number of Cortex features accept a model name as a string argument, for example AI_COMPLETE('model', 'prompt'). Cortex first treats this argument as the identifier of a schema-level model object. If a model object is found, RBAC is applied to determine whether the user can use the model. If no model object is found, the argument is interpreted as a plain model name and matched against the account-level allowlist.
The following example illustrates the use of the allowlist and RBAC together. In this example, the allowlist allows the mistral-large2 model, and the user has access to the LLAMA3.1-70B model object through RBAC.
-- set up access
USE SECONDARY ROLES NONE;
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'MISTRAL-LARGE2';
CALL SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH();
GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-LLAMA3.1-70B" TO ROLE PUBLIC;
-- test access
USE ROLE PUBLIC;
-- this succeeds because mistral-large2 is in the allowlist
SELECT AI_COMPLETE('MISTRAL-LARGE2', 'Hello');
-- this succeeds because the role has access to the model object
SELECT AI_COMPLETE('SNOWFLAKE.MODELS."LLAMA3.1-70B"', 'Hello');
-- this fails because the first argument is
-- neither an identifier for an accessible model object
-- nor is it a model name in the allowlist
SELECT AI_COMPLETE('SNOWFLAKE-ARCTIC', 'Hello');
Common pitfalls¶
Access to a model (whether by allowlist or RBAC) does not guarantee that it can be used. The model may still be subject to cross-region, deprecation, or other availability constraints, which can produce error messages that resemble model access errors.
Model access controls govern only the use of a model, not the use of a feature itself, which may have its own access controls. For example, access to AI_COMPLETE is governed by the CORTEX_USER database role. See Required privileges for more information.
Not all features support model access controls. See the supported features table to see which access control methods a given feature supports.
Secondary roles can obscure permissions. For example, if a user has ACCOUNTADMIN as a secondary role, all model objects may appear accessible. Disable secondary roles temporarily when verifying permissions.
Keep in mind that with RBAC you must use model object identifiers, and that these are quoted identifiers and therefore case-sensitive. See QUOTED_IDENTIFIERS_IGNORE_CASE for more information.
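To see the case-sensitivity pitfall concretely, compare the following calls. The first resolves the model object; the second fails under default settings because the lowercase quoted identifier does not match the uppercase object name:

```sql
-- Matches the model object created by CORTEX_BASE_MODELS_REFRESH.
SELECT AI_COMPLETE('SNOWFLAKE.MODELS."LLAMA3.1-70B"', 'Hello');

-- Fails unless QUOTED_IDENTIFIERS_IGNORE_CASE is enabled:
-- quoted identifiers are case-sensitive.
SELECT AI_COMPLETE('SNOWFLAKE.MODELS."llama3.1-70b"', 'Hello');
```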
Supported features¶
Model access controls are supported by the following features:
| Feature | Account-level allowlist | Role-based access control | Notes |
|---|---|---|---|
| | ✔ | ✔ | |
| | ✔ | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist. |
| | ✔ | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist. |
| | ✔ | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist. |
| | ✔ | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist. |
| | ✔ | ✔ | |
| | ✔ | ✔ | |
| | ✔ | ✔ | |
| | ✔ | ✔ | |
Regional availability¶
Snowflake Cortex AI functions are available natively in the following regions. If your region is not listed for a particular function, use cross-region inference.
Note
The TRY_COMPLETE function is available in the same regions as COMPLETE.
The COUNT_TOKENS function is available in all regions for any model, but the models themselves are available only in the regions specified in the tables below.
The following models are available in any region via cross-region inference.
| Function (Model) | Cross Cloud (Any Region) | AWS US (Cross-Region) | AWS EU (Cross-Region) | AWS APJ (Cross-Region) | Azure US (Cross-Region) |
|---|---|---|---|---|---|
| AI_COMPLETE (claude-4-sonnet) | ✔ | ✔ | | | |
| AI_COMPLETE (claude-4-opus) | In preview | In preview | | | |
| AI_COMPLETE (claude-3-7-sonnet) | ✔ | ✔ | ✔ | | |
| AI_COMPLETE (claude-3-5-sonnet) | ✔ | ✔ | | | |
| AI_COMPLETE (llama4-maverick) | ✔ | ✔ | | | |
| AI_COMPLETE (llama4-scout) | ✔ | ✔ | | | |
| AI_COMPLETE (llama3.2-1b) | ✔ | ✔ | | | |
| AI_COMPLETE (llama3.2-3b) | ✔ | ✔ | | | |
| AI_COMPLETE (llama3.1-8b) | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_COMPLETE (llama3.1-70b) | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_COMPLETE (llama3.3-70b) | ✔ | ✔ | | | |
| AI_COMPLETE (snowflake-llama-3.3-70b) | ✔ | ✔ | | | |
| AI_COMPLETE (llama3.1-405b) | ✔ | ✔ | ✔ | | |
| AI_COMPLETE (openai-gpt-4.1) | In preview | In preview | | | |
| AI_COMPLETE (openai-o4-mini) | In preview | In preview | | | |
| AI_COMPLETE (openai-gpt-5) | In preview | In preview | | | |
| AI_COMPLETE (openai-gpt-5-mini) | In preview | In preview | | | |
| AI_COMPLETE (openai-gpt-5-nano) | In preview | In preview | | | |
| AI_COMPLETE (openai-gpt-5-chat) | In preview | | | | |
| AI_COMPLETE (openai-gpt-oss-120b) | In preview | | | | |
| AI_COMPLETE (openai-gpt-oss-20b) | In preview | | | | |
| AI_COMPLETE (snowflake-llama-3.1-405b) | ✔ | ✔ | | | |
| AI_COMPLETE (snowflake-arctic) | ✔ | ✔ | ✔ | | |
| AI_COMPLETE (deepseek-r1) | ✔ | ✔ | | | |
| AI_COMPLETE (reka-core) | ✔ | ✔ | | | |
| AI_COMPLETE (reka-flash) | ✔ | ✔ | ✔ | | |
| AI_COMPLETE (mistral-large2) | ✔ | ✔ | ✔ | ✔ | |
| AI_COMPLETE (mixtral-8x7b) | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_COMPLETE (mistral-7b) | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_COMPLETE (jamba-instruct) | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_COMPLETE (jamba-1.5-mini) | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_COMPLETE (jamba-1.5-large) | ✔ | ✔ | ✔ | | |
| AI_COMPLETE (gemma-7b) | ✔ | ✔ | ✔ | ✔ | |
| EMBED_TEXT_768 (e5-base-v2) | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_768 (snowflake-arctic-embed-m) | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_768 (snowflake-arctic-embed-m-v1.5) | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0) | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0-8k) | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (nv-embed-qa-4) | ✔ | ✔ | | | |
| EMBED_TEXT_1024 (multilingual-e5-large) | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (voyage-multilingual-2) | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_CLASSIFY TEXT | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_CLASSIFY IMAGE | ✔ | | | | |
| AI_EXTRACT | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_FILTER TEXT | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_FILTER IMAGE | ✔ | | | | |
| AI_AGG | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_SENTIMENT | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_SIMILARITY TEXT | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_SIMILARITY IMAGE | ✔ | ✔ | ✔ | | |
| AI_SUMMARIZE_AGG | ✔ | ✔ | ✔ | ✔ | ✔ |
| EXTRACT_ANSWER | ✔ | ✔ | ✔ | ✔ | ✔ |
| SENTIMENT | ✔ | ✔ | ✔ | ✔ | ✔ |
| ENTITY_SENTIMENT | ✔ | ✔ | ✔ | ✔ | ✔ |
| SUMMARIZE | ✔ | ✔ | ✔ | ✔ | ✔ |
| TRANSLATE | ✔ | ✔ | ✔ | ✔ | ✔ |
The following functions and models are available natively in North American regions.
| Function (Model) | AWS US West 2 (Oregon) | AWS US East 1 (N. Virginia) | AWS US East (Commercial Gov - N. Virginia) | Azure East US 2 (Virginia) | Azure East US (Virginia) | Azure West US (Washington) | Azure West US 3 (Arizona) | Azure North Central US (Illinois) | Azure South Central US (Texas) |
|---|---|---|---|---|---|---|---|---|---|
| AI_COMPLETE (claude-4-sonnet) | ✔ | | | | | | | | |
| AI_COMPLETE (claude-4-opus) | In preview | | | | | | | | |
| AI_COMPLETE (claude-3-7-sonnet) | ✔ | | | | | | | | |
| AI_COMPLETE (claude-3-5-sonnet) | ✔ | ✔ | | | | | | | |
| AI_COMPLETE (llama4-maverick) | ✔ | | | | | | | | |
| AI_COMPLETE (llama4-scout) | ✔ | | | | | | | | |
| AI_COMPLETE (llama3.2-1b) | ✔ | | | | | | | | |
| AI_COMPLETE (llama3.2-3b) | ✔ | | | | | | | | |
| AI_COMPLETE (llama3.1-8b) | ✔ | ✔ | ✔ | ✔ | | | | | |
| AI_COMPLETE (llama3.1-70b) | ✔ | ✔ | ✔ | ✔ | | | | | |
| AI_COMPLETE (llama3.3-70b) | ✔ | | | | | | | | |
| AI_COMPLETE (snowflake-llama-3.3-70b) | ✔ | | | | | | | | |
| AI_COMPLETE (llama3.1-405b) | ✔ | ✔ | ✔ | ✔ | | | | | |
| AI_COMPLETE (openai-gpt-4.1) | In preview | | | | | | | | |
| AI_COMPLETE (openai-o4-mini) | In preview | | | | | | | | |
| AI_COMPLETE (openai-gpt-oss-120b) | In preview | | | | | | | | |
| AI_COMPLETE (openai-gpt-oss-20b) | In preview | In preview | | | | | | | |
| AI_COMPLETE (snowflake-llama-3.1-405b) | ✔ | | | | | | | | |
| AI_COMPLETE (snowflake-arctic) | ✔ | ✔ | | | | | | | |
| AI_COMPLETE (deepseek-r1) | ✔ | | | | | | | | |
| AI_COMPLETE (reka-core) | ✔ | ✔ | | | | | | | |
| AI_COMPLETE (reka-flash) | ✔ | ✔ | ✔ | | | | | | |
| AI_COMPLETE (mistral-large2) | ✔ | ✔ | ✔ | ✔ | | | | | |
| AI_COMPLETE (mixtral-8x7b) | ✔ | ✔ | ✔ | ✔ | | | | | |
| AI_COMPLETE (mistral-7b) | ✔ | ✔ | ✔ | ✔ | | | | | |
| AI_COMPLETE (jamba-instruct) | ✔ | | | | | | | | |
| AI_COMPLETE (jamba-1.5-mini) | ✔ | | | | | | | | |
| AI_COMPLETE (jamba-1.5-large) | ✔ | | | | | | | | |
| AI_COMPLETE (gemma-7b) | ✔ | ✔ | ✔ | ✔ | | | | | |
| EMBED_TEXT_768 (e5-base-v2) | ✔ | ✔ | ✔ | ✔ | | | | | |
| EMBED_TEXT_768 (snowflake-arctic-embed-m) | ✔ | ✔ | ✔ | ✔ | | | | | |
| EMBED_TEXT_768 (snowflake-arctic-embed-m-v1.5) | ✔ | ✔ | ✔ | ✔ | | | | | |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0) | ✔ | ✔ | ✔ | ✔ | | | | | |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0-8k) | ✔ | ✔ | ✔ | ✔ | | | | | |
| EMBED_TEXT_1024 (nv-embed-qa-4) | ✔ | | | | | | | | |
| EMBED_TEXT_1024 (multilingual-e5-large) | ✔ | ✔ | ✔ | ✔ | | | | | |
| EMBED_TEXT_1024 (voyage-multilingual-2) | ✔ | ✔ | ✔ | ✔ | | | | | |
| AI_CLASSIFY TEXT | ✔ | ✔ | ✔ | | | | | | |
| AI_CLASSIFY IMAGE | ✔ | ✔ | | | | | | | |
| AI_EXTRACT | ✔ | ✔ | ✔ | ✔ | ✔ | | | | |
| AI_FILTER TEXT | ✔ | ✔ | ✔ | | | | | | |
| AI_FILTER IMAGE | ✔ | ✔ | | | | | | | |
| AI_AGG | ✔ | ✔ | ✔ | | | | | | |
| AI_SIMILARITY TEXT | ✔ | ✔ | ✔ | | | | | | |
| AI_SIMILARITY IMAGE | ✔ | ✔ | | | | | | | |
| AI_SUMMARIZE_AGG | ✔ | ✔ | ✔ | | | | | | |
| AI_TRANSCRIBE | | | | | | | | | |
| EXTRACT_ANSWER | ✔ | ✔ | ✔ | ✔ | | | | | |
| SENTIMENT | ✔ | ✔ | ✔ | ✔ | | | | | |
| ENTITY_SENTIMENT | ✔ | ✔ | ✔ | ✔ | | | | | |
| SUMMARIZE | ✔ | ✔ | ✔ | ✔ | | | | | |
| TRANSLATE | ✔ | ✔ | ✔ | ✔ | | | | | |
The following functions and models are available natively in European regions.
| Function (Model) | AWS Europe Central 1 (Frankfurt) | AWS Europe West 1 (Ireland) | Azure West Europe (Netherlands) |
|---|---|---|---|
| AI_COMPLETE (claude-4-sonnet) | | | |
| AI_COMPLETE (claude-4-opus) | | | |
| AI_COMPLETE (claude-3-7-sonnet) | | | |
| AI_COMPLETE (claude-3-5-sonnet) | | | |
| AI_COMPLETE (llama4-maverick) | | | |
| AI_COMPLETE (llama4-scout) | | | |
| AI_COMPLETE (llama3.2-1b) | | | |
| AI_COMPLETE (llama3.2-3b) | | | |
| AI_COMPLETE (llama3.1-8b) | ✔ | ✔ | ✔ |
| AI_COMPLETE (llama3.1-70b) | ✔ | ✔ | ✔ |
| AI_COMPLETE (llama3.3-70b) | | | |
| AI_COMPLETE (snowflake-llama-3.3-70b) | | | |
| AI_COMPLETE (llama3.1-405b) | | | |
| AI_COMPLETE (openai-gpt-4.1) | | | |
| AI_COMPLETE (openai-o4-mini) | | | |
| AI_COMPLETE (openai-gpt-oss-120b) | | | |
| AI_COMPLETE (openai-gpt-oss-20b) | | | |
| AI_COMPLETE (snowflake-llama-3.1-405b) | | | |
| AI_COMPLETE (snowflake-arctic) | | | |
| AI_COMPLETE (deepseek-r1) | | | |
| AI_COMPLETE (reka-core) | | | |
| AI_COMPLETE (reka-flash) | | | |
| AI_COMPLETE (mistral-large2) | ✔ | ✔ | ✔ |
| AI_COMPLETE (mixtral-8x7b) | ✔ | ✔ | ✔ |
| AI_COMPLETE (mistral-7b) | ✔ | ✔ | ✔ |
| AI_COMPLETE (jamba-instruct) | ✔ | | |
| AI_COMPLETE (jamba-1.5-mini) | ✔ | | |
| AI_COMPLETE (jamba-1.5-large) | | | |
| AI_COMPLETE (gemma-7b) | ✔ | ✔ | |
| EMBED_TEXT_768 (e5-base-v2) | ✔ | ✔ | |
| EMBED_TEXT_768 (snowflake-arctic-embed-m) | ✔ | ✔ | ✔ |
| EMBED_TEXT_768 (snowflake-arctic-embed-m-v1.5) | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0) | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0-8k) | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (nv-embed-qa-4) | | | |
| EMBED_TEXT_1024 (multilingual-e5-large) | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (voyage-multilingual-2) | ✔ | ✔ | ✔ |
| AI_CLASSIFY TEXT | ✔ | ✔ | ✔ |
| AI_CLASSIFY IMAGE | ✔ | | |
| AI_EXTRACT | ✔ | ✔ | ✔ |
| AI_FILTER TEXT | ✔ | ✔ | ✔ |
| AI_FILTER IMAGE | ✔ | | |
| AI_AGG | ✔ | ✔ | ✔ |
| AI_SIMILARITY TEXT | ✔ | ✔ | ✔ |
| AI_SIMILARITY IMAGE | ✔ | | |
| AI_SUMMARIZE_AGG | ✔ | ✔ | ✔ |
| AI_TRANSCRIBE | | | |
| EXTRACT_ANSWER | ✔ | ✔ | ✔ |
| SENTIMENT | ✔ | ✔ | ✔ |
| ENTITY_SENTIMENT | ✔ | ✔ | |
| SUMMARIZE | ✔ | ✔ | ✔ |
| TRANSLATE | ✔ | ✔ | ✔ |
The following functions and models are available natively in Asia Pacific regions.
| Function (Model) | AWS AP Southeast 2 (Sydney) | AWS AP Northeast 1 (Tokyo) |
|---|---|---|
| AI_COMPLETE (claude-4-sonnet) | | |
| AI_COMPLETE (claude-4-opus) | | |
| AI_COMPLETE (claude-3-7-sonnet) | | |
| AI_COMPLETE (claude-3-5-sonnet) | ✔ | |
| AI_COMPLETE (llama4-maverick) | | |
| AI_COMPLETE (llama4-scout) | | |
| AI_COMPLETE (llama3.2-1b) | | |
| AI_COMPLETE (llama3.2-3b) | | |
| AI_COMPLETE (llama3.1-8b) | ✔ | ✔ |
| AI_COMPLETE (llama3.1-70b) | ✔ | ✔ |
| AI_COMPLETE (llama3.3-70b) | | |
| AI_COMPLETE (snowflake-llama-3.3-70b) | | |
| AI_COMPLETE (llama3.1-405b) | | |
| AI_COMPLETE (openai-gpt-4.1) | | |
| AI_COMPLETE (openai-o4-mini) | | |
| AI_COMPLETE (snowflake-llama-3.1-405b) | | |
| AI_COMPLETE (snowflake-arctic) | | |
| AI_COMPLETE (deepseek-r1) | | |
| AI_COMPLETE (reka-core) | | |
| AI_COMPLETE (reka-flash) | ✔ | |
| AI_COMPLETE (mistral-large2) | ✔ | ✔ |
| AI_COMPLETE (mixtral-8x7b) | ✔ | ✔ |
| AI_COMPLETE (mistral-7b) | ✔ | ✔ |
| AI_COMPLETE (jamba-instruct) | ✔ | |
| AI_COMPLETE (jamba-1.5-mini) | ✔ | |
| AI_COMPLETE (jamba-1.5-large) | | |
| AI_COMPLETE (gemma-7b) | | |
| EMBED_TEXT_768 (e5-base-v2) | ✔ | ✔ |
| EMBED_TEXT_768 (snowflake-arctic-embed-m) | ✔ | ✔ |
| EMBED_TEXT_768 (snowflake-arctic-embed-m-v1.5) | ✔ | ✔ |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0) | ✔ | ✔ |
| EMBED_TEXT_1024 (snowflake-arctic-embed-l-v2.0-8k) | ✔ | ✔ |
| EMBED_TEXT_1024 (nv-embed-qa-4) | | |
| EMBED_TEXT_1024 (multilingual-e5-large) | ✔ | ✔ |
| EMBED_TEXT_1024 (voyage-multilingual-2) | ✔ | ✔ |
| AI_EXTRACT | ✔ | ✔ |
| AI_CLASSIFY TEXT | ✔ | ✔ |
| AI_CLASSIFY IMAGE | | |
| AI_FILTER TEXT | ✔ | ✔ |
| AI_FILTER IMAGE | | |
| AI_AGG | ✔ | ✔ |
| AI_SIMILARITY TEXT | ✔ | ✔ |
| AI_SIMILARITY IMAGE | | |
| AI_SUMMARIZE_AGG | ✔ | ✔ |
| AI_TRANSCRIBE | | |
| EXTRACT_ANSWER | ✔ | ✔ |
| SENTIMENT | ✔ | ✔ |
| ENTITY_SENTIMENT | ✔ | |
| SUMMARIZE | ✔ | ✔ |
| TRANSLATE | ✔ | ✔ |
The following Snowflake Cortex AI functions are currently available in the following extended regions.
| Function (Model) | AWS US East 2 (Ohio) | AWS CA Central 1 (Central) | AWS SA East 1 (São Paulo) | AWS Europe West 2 (London) | AWS Europe Central 1 (Frankfurt) | AWS Europe North 1 (Stockholm) | AWS AP Northeast 1 (Tokyo) | AWS AP South 1 (Mumbai) | AWS AP Southeast 2 (Sydney) | AWS AP Southeast 3 (Jakarta) | Azure South Central US (Texas) | Azure West US 2 (Washington) | Azure UK South (London) | Azure North Europe (Ireland) | Azure Switzerland North (Zürich) | Azure Central India (Pune) | Azure Japan East (Tokyo, Saitama) | Azure Southeast Asia (Singapore) | Azure Australia East (New South Wales) | GCP Europe West 2 (London) | GCP Europe West 4 (Netherlands) | GCP US Central 1 (Iowa) | GCP US East 4 (N. Virginia) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EMBED_TEXT_768 (snowflake-arctic-embed-m-v1.5) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_768 (snowflake-arctic-embed-m) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| EMBED_TEXT_1024 (multilingual-e5-large) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| AI_EXTRACT | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | Cross-region only | Cross-region only | Cross-region only | Cross-region only |
The following table lists legacy models. If you’re just getting started, start with models in the previous table.
| Function (Model) | AWS US West 2 (Oregon) | AWS US East 1 (N. Virginia) | AWS Europe Central 1 (Frankfurt) | AWS Europe West 1 (Ireland) | AWS AP Southeast 2 (Sydney) | AWS AP Northeast 1 (Tokyo) | Azure East US 2 (Virginia) | Azure West Europe (Netherlands) |
|---|---|---|---|---|---|---|---|---|
| AI_COMPLETE (llama2-70b-chat) | ✔ | ✔ | ✔ | ✔ | ✔ | | | |
| AI_COMPLETE (llama3-8b) | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | | |
| AI_COMPLETE (llama3-70b) | ✔ | ✔ | ✔ | ✔ | ✔ | | | |
| AI_COMPLETE (mistral-large) | ✔ | ✔ | ✔ | ✔ | ✔ | | | |
Cost considerations¶
Snowflake Cortex AI functions incur compute cost based on the number of tokens processed. Refer to the Snowflake Service Consumption Table for each function’s cost in credits per million tokens.
A token is the smallest unit of text processed by Snowflake Cortex AI functions. An industry convention for text is that a token is approximately equal to four characters, although this can vary by model, as can token equivalence for media files.
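The four-characters-per-token convention can be sketched as a quick client-side estimate. This is only a heuristic; actual tokenization varies by model, so use COUNT_TOKENS (SNOWFLAKE.CORTEX) when you need the exact, billable count:

```python
def estimate_tokens(text: str) -> int:
    # Rough estimate using the ~4 characters per token convention.
    # Actual counts vary by model; use COUNT_TOKENS for exact counts.
    return max(1, len(text) // 4)

print(estimate_tokens("Is a hot dog a sandwich?"))  # 24 characters -> 6
```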
For functions that generate new text in the response (AI_COMPLETE, AI_CLASSIFY, AI_FILTER, AI_AGG, AI_SUMMARIZE, and TRANSLATE, and their previous versions in the SNOWFLAKE.CORTEX schema), both input and output tokens are billable.
For Cortex Guard, only input tokens are counted. The number of input tokens is based on the number of tokens output from AI_COMPLETE (or COMPLETE). Cortex Guard usage is billed in addition to the cost of the AI_COMPLETE (or COMPLETE) function.
For AI_SIMILARITY and the EMBED_* functions, only input tokens are counted.
For EXTRACT_ANSWER, the number of billable tokens is the sum of the number of tokens in the from_text and question fields.
AI_CLASSIFY, AI_FILTER, AI_AGG, AI_SENTIMENT, AI_SUMMARIZE_AGG, SUMMARIZE, TRANSLATE, EXTRACT_ANSWER, ENTITY_SENTIMENT, and SENTIMENT add a prompt to the input text in order to generate the response. As a result, the input token count is higher than the number of tokens in the text you provide.
AI_CLASSIFY labels, descriptions, and examples are counted as input tokens for each record processed, not just once for each AI_CLASSIFY call.
For AI_PARSE_DOCUMENT (or SNOWFLAKE.CORTEX.PARSE_DOCUMENT), billing is based on the number of document pages processed.
TRY_COMPLETE (SNOWFLAKE.CORTEX) does not incur costs for error handling; if the function returns NULL, no cost is incurred.
For AI_EXTRACT, both input and output tokens are counted. The responseFormat argument is counted as input tokens. For document formats consisting of pages, the number of pages processed is counted as input tokens; each page in a document is counted as 970 tokens.
COUNT_TOKENS (SNOWFLAKE.CORTEX) incurs only the compute cost of running the function. No additional token-based costs are incurred.
For models that support media files such as images or audio:
Audio files are billed at 50 tokens per second of audio.
The token equivalence of images is determined by the model used. For more information, see AI Image cost considerations.
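The media billing rules above reduce to simple arithmetic. A sketch for audio (50 tokens per second) and page-based documents (970 tokens per page, per the AI_EXTRACT rule):

```python
def audio_input_tokens(duration_seconds: float) -> int:
    # Audio files are billed at 50 tokens per second of audio.
    return int(duration_seconds * 50)

def document_input_tokens(pages: int) -> int:
    # Each page in a document is counted as 970 input tokens (AI_EXTRACT).
    return pages * 970

print(audio_input_tokens(180))    # a 3-minute recording -> 9000 tokens
print(document_input_tokens(12))  # a 12-page document -> 11640 tokens
```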
Snowflake recommends executing queries that call a Snowflake Cortex AISQL function with a smaller warehouse (no larger than MEDIUM). Larger warehouses do not increase performance. The cost associated with keeping a warehouse active continues to apply when executing a query that calls a Snowflake Cortex LLM Function. For general information on compute costs, see Understanding compute cost.
Track costs for AI services¶
To track credits used for AI Services, including LLM functions, in your account, use the METERING_DAILY_HISTORY view:
SELECT *
FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_DAILY_HISTORY
WHERE SERVICE_TYPE='AI_SERVICES';
Track credit consumption for AISQL functions¶
To view the credit and token consumption for each AISQL function call, use the CORTEX_FUNCTIONS_USAGE_HISTORY view:
SELECT *
FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_USAGE_HISTORY;
You can also view the credit and token consumption for each query within your Snowflake account. Viewing the credit and token consumption for each query helps you identify queries that are consuming the most credits and tokens.
The following example query uses the CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY view to show the credit and token consumption for all of your queries within your account.
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY;
You can also use the same view to see the credit and token consumption for a specific query.
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY
WHERE query_id='<query-id>';
Note
You can’t get granular usage information for requests made with the REST API.
The query usage history is grouped by the models used in the query. For example, if you ran:
SELECT AI_COMPLETE('mistral-7b', 'Is a hot dog a sandwich'), AI_COMPLETE('mistral-large', 'Is a hot dog a sandwich');
the query usage history would show two rows: one for mistral-7b and one for mistral-large.
Usage quotas¶
On-demand Snowflake accounts without a valid payment method (such as trial accounts) are limited to 10 credits per day for Snowflake Cortex AISQL usage. To remove this limit, convert your trial account to a paid account.
Managing costs¶
Snowflake recommends using a warehouse size no larger than MEDIUM when calling Snowflake Cortex AISQL functions. Using a larger warehouse than necessary does not increase performance, but can result in unnecessary costs. This recommendation may change in the future as we continue to evolve Cortex AISQL functions.
Model restrictions¶
Models used by Snowflake Cortex have limitations on size as described in the table below. Sizes are given in tokens. Tokens generally represent about four characters of text, so the number of words corresponding to a limit is less than the number of tokens. Inputs that exceed the limit result in an error.
The maximum size of the output that a model can produce is limited by the following:
The model’s output token limit.
The space available in the context window after the model consumes the input tokens.
For example, claude-3-5-sonnet has a context window of 200,000 tokens and an output limit of 8,192 tokens. If 100,000 tokens are used for the input, the model can generate up to 8,192 tokens. However, if 195,000 tokens are used as input, then the model can only generate up to 5,000 tokens, for a total of 200,000 tokens.
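The output-budget rule above can be sketched in a few lines of Python. The numbers are the claude-3-5-sonnet limits from the example; the helper function is illustrative, not part of any Snowflake API:

```python
def available_output_tokens(context_window: int, max_output_limit: int, input_tokens: int) -> int:
    """Output is capped by the model's output token limit AND by the space
    left in the context window after the input tokens are consumed."""
    if input_tokens >= context_window:
        raise ValueError("input exceeds the model's context window")
    return min(max_output_limit, context_window - input_tokens)

# claude-3-5-sonnet: 200,000-token context window, 8,192-token output limit
print(available_output_tokens(200_000, 8_192, 100_000))  # 8192
print(available_output_tokens(200_000, 8_192, 195_000))  # 5000
```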
Important
In the AWS AP Southeast 2 (Sydney) region:

- The context window for llama3-8b and mistral-7b is 4,096 tokens.
- The context window for llama3.1-8b is 16,384 tokens.
- The context window for the Snowflake managed model used by the SUMMARIZE function is 4,096 tokens.

In the AWS Europe West 1 (Ireland) region:

- The context window for llama3.1-8b is 16,384 tokens.
- The context window for mistral-7b is 4,096 tokens.
| Function | Model | Context window (tokens) | Max output (tokens) |
|---|---|---|---|
| COMPLETE |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 4,096 | 8,192 |
|  |  | 32,768 | 8,192 |
|  |  | 200,000 | 8,192 |
|  |  | 200,000 | 32,000 |
|  |  | 200,000 | 32,000 |
|  |  | 200,000 | 8,192 |
|  |  | 32,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 32,000 |
|  |  | 200,000 | 32,000 |
|  |  | 272,000 | 8,192 |
|  |  | 272,000 | 8,192 |
|  |  | 272,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 100,000 | 8,192 |
|  |  | 32,000 | 8,192 |
|  |  | 256,000 | 8,192 |
|  |  | 256,000 | 8,192 |
|  |  | 256,000 | 8,192 |
|  |  | 32,000 | 8,192 |
|  |  | 4,096 | 8,192 |
|  |  | 8,000 | 8,192 |
|  |  | 8,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 8,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 128,000 | 8,192 |
|  |  | 32,000 | 8,192 |
|  |  | 8,000 | 8,192 |
| EMBED_TEXT_768 |  | 512 | n/a |
|  |  | 512 | n/a |
| EMBED_TEXT_1024 |  | 512 | n/a |
|  |  | 512 | n/a |
|  |  | 32,000 | n/a |
| AI_EXTRACT |  | 128,000 | 51,200 |
| AI_FILTER | Snowflake managed model | 128,000 | n/a |
| AI_CLASSIFY | Snowflake managed model | 128,000 | n/a |
| AI_AGG | Snowflake managed model | 128,000 per row; can be used across multiple rows | 8,192 |
| AI_SENTIMENT | Snowflake managed model | 2,048 | n/a |
| AI_SUMMARIZE_AGG | Snowflake managed model | 128,000 per row; can be used across multiple rows | 8,192 |
| ENTITY_SENTIMENT | Snowflake managed model | 2,048 | n/a |
| EXTRACT_ANSWER | Snowflake managed model | 2,048 for text; 64 for question | n/a |
| SENTIMENT | Snowflake managed model | 512 | n/a |
| SUMMARIZE | Snowflake managed model | 32,000 | 4,096 |
| TRANSLATE | Snowflake managed model | 4,096 | n/a |
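As an illustration, the managed-function limits in the table above can drive a client-side pre-check before calling the SQL functions. The helper names and the rough four-characters-per-token heuristic are our own sketch, not part of the Snowflake API:

```python
# Context-window limits (tokens) for some Snowflake managed functions,
# copied from the table above.
CONTEXT_LIMITS = {
    "SENTIMENT": 512,
    "EXTRACT_ANSWER": 2_048,  # limit for the text argument
    "SUMMARIZE": 32_000,
    "TRANSLATE": 4_096,
}

def estimate_tokens(text: str) -> int:
    """Rough heuristic: a token is about four characters of text."""
    return max(1, len(text) // 4)

def fits(function: str, text: str) -> bool:
    """True if the text is likely within the function's input limit."""
    return estimate_tokens(text) <= CONTEXT_LIMITS[function]

print(fits("SENTIMENT", "Great service!"))  # True
print(fits("SENTIMENT", "x" * 10_000))      # False
```

Because the ratio is only an estimate, a check like this catches obvious oversize inputs early; the server-side limit is still authoritative and inputs that exceed it result in an error.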
Choosing a model¶
The Snowflake Cortex AI_COMPLETE function supports multiple models of varying capability, latency, and cost. These models have been carefully chosen to align with common customer use cases. To achieve the best performance per credit, choose a model that’s a good match for the content size and complexity of your task. Here are brief overviews of the available models.
Large models¶
If you’re not sure where to start, try the most capable models first to establish a baseline to evaluate other models.
claude-3-7-sonnet, reka-core, and mistral-large2 are the most capable models offered by Snowflake Cortex, and will give you a good idea of what a state-of-the-art model can do.

- claude-3-7-sonnet is a leader in general reasoning and multimodal capabilities. It outperforms its predecessors in tasks that require reasoning across different domains and modalities. You can use its large output capacity to get more information from either structured or unstructured queries. Its reasoning capabilities and large context window make it well suited for agentic workflows.
- deepseek-r1 is a foundation model trained using large-scale reinforcement learning (RL) without supervised fine-tuning (SFT). It delivers high performance across math, code, and reasoning tasks. To access the model, set the cross-region inference parameter to AWS_US.
- mistral-large2 is Mistral AI's most advanced large language model with top-tier reasoning capabilities. Compared to mistral-large, it is significantly more capable in code generation, mathematics, and reasoning, and provides much stronger multilingual support. It is ideal for complex tasks that require large reasoning capabilities or are highly specialized, such as synthetic text generation, code generation, and multilingual text analytics.
- llama3.1-405b is an open source model from the llama3.1 model family from Meta with a large 128K context window. It excels in long document processing, multilingual support, synthetic data generation, and model distillation.
- snowflake-llama3.1-405b is derived from the open source llama3.1 model. It uses the SwiftKV optimizations developed by the Snowflake AI research team to deliver up to a 75% inference cost reduction. SwiftKV achieves higher throughput performance with minimal accuracy loss.
Medium models¶
- llama3.1-70b is an open source model that demonstrates state-of-the-art performance ideal for chat applications, content creation, and enterprise applications. It is a highly performant, cost-effective model that enables diverse use cases with a context window of 128K. llama3-70b is still supported and has a context window of 8K.
- snowflake-llama3.3-70b is derived from the open source llama3.3 model. It uses the SwiftKV optimizations (https://www.snowflake.com/en/blog/up-to-75-lower-inference-cost-llama-meta-llm/) developed by the Snowflake AI research team to deliver up to a 75% inference cost reduction. SwiftKV achieves higher throughput performance with minimal accuracy loss.
- snowflake-arctic is Snowflake's top-tier enterprise-focused LLM. Arctic excels at enterprise tasks such as SQL generation, coding, and instruction-following benchmarks.
- mixtral-8x7b is ideal for text generation, classification, and question answering. Mistral models are optimized for low latency with low memory requirements, which translates into higher throughput for enterprise use cases.
- The jamba-instruct model is built by AI21 Labs to efficiently meet enterprise requirements. It is optimized to offer a 256K-token context window with low cost and latency, making it ideal for tasks like summarization, Q&A, and entity extraction on lengthy documents and extensive knowledge bases.
- The AI21 Jamba 1.5 family consists of state-of-the-art, hybrid SSM-Transformer instruction-following foundation models. The jamba-1.5-mini and jamba-1.5-large models, with a context length of 256K, support use cases such as structured output (JSON) and grounded generation.
Small models¶
- The llama3.2-1b and llama3.2-3b models support a context length of 128K tokens and are state-of-the-art in their class for use cases like summarization, instruction following, and rewriting tasks. The Llama 3.2 models deliver multilingual capabilities, with support for English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
- llama3.1-8b is ideal for tasks that require low to moderate reasoning. It is a lightweight, ultra-fast model with a context window of 128K.
- llama3-8b and llama2-70b-chat are still-supported models that provide a smaller context window and relatively lower accuracy.
- mistral-7b is ideal for your simplest summarization, structuring, and question answering tasks that need to be done quickly. It offers low latency and high-throughput processing for multiple pages of text with its 32K context window.
- gemma-7b is suitable for simple code and text completion tasks. It has a context window of 8,000 tokens but is surprisingly capable within that limit, and quite cost-effective.
The following table provides information on how popular models perform on various benchmarks, including the models offered by Snowflake Cortex AI_COMPLETE as well as a few other popular models.
| Model | Context Window (tokens) | MMLU (Reasoning) | HumanEval (Coding) | GSM8K (Arithmetic Reasoning) | Spider 1.0 (SQL) |
|---|---|---|---|---|---|
|  | 128,000 | 88.7 | 90.2 | 96.4 | - |
|  | 200,000 | 88.3 | 92.0 | 96.4 | - |
|  | 128,000 | 88.6 | 89 | 96.8 | - |
|  | 32,000 | 83.2 | 76.8 | 92.2 | - |
|  | 128,000 | 86 | 80.5 | 95.1 | - |
|  | 128,000 | 84 | 92 | 93 | - |
|  | 100,000 | 75.9 | 72 | 81 | - |
|  | 128,000 | 73 | 72.6 | 84.9 | - |
|  | 32,000 | 70.6 | 40.2 | 60.4 | - |
|  | 256,000 | 68.2 | 40 | 59.9 | - |
|  | 256,000 | 69.7 | - | 75.8 | - |
|  | 256,000 | 81.2 | - | 87 | - |
|  | 4,096 | 67.3 | 64.3 | 69.7 | 79 |
|  | 128,000 | 49.3 | - | 44.4 | - |
|  | 128,000 | 69.4 | - | 77.7 | - |
|  | 8,000 | 64.3 | 32.3 | 46.4 | - |
|  | 32,000 | 62.5 | 26.2 | 52.1 | - |
| GPT 3.5 Turbo* | 4,097 | 70 | 48.1 | 57.1 | - |
Previous model versions¶
The Snowflake Cortex AI_COMPLETE and COMPLETE functions also support the following older model versions. We recommend using the latest model versions instead of those listed in this table.
| Model | Context Window (tokens) | MMLU (Reasoning) | HumanEval (Coding) | GSM8K (Arithmetic Reasoning) | Spider 1.0 (SQL) |
|---|---|---|---|---|---|
|  | 32,000 | 81.2 | 45.1 | 81 | 81 |
|  | 4,096 | 68.9 | 30.5 | 57.5 | - |
Using Snowflake Cortex AISQL with Python¶
Call Cortex AISQL functions in Snowpark Python¶
You can use Snowflake Cortex AISQL functions in the Snowpark Python API. Note that in Snowpark Python, these functions have names in Pythonic "snake_case" format, with words separated by underscores and all letters in lowercase.
ai_agg example¶
The ai_agg function aggregates a column of text using natural language instructions, in a similar manner to how you would ask an analyst to summarize or extract findings from grouped or ungrouped data.

The following example summarizes customer reviews for each product using the ai_agg function. The function takes a column of text and a natural language instruction to summarize the reviews.
from snowflake.snowpark.functions import ai_agg, col
df = session.create_dataframe([
[1, "Excellent product!"],
[1, "Great battery life."],
[1, "A bit expensive but worth it."],
[2, "Terrible customer service."],
[2, "Won’t buy again."],
], schema=["product_id", "review"])
# Summarize reviews per product
summary_df = df.group_by("product_id").agg(
ai_agg(col("review"), "Summarize the customer reviews in one sentence.")
)
summary_df.show()
Note
Use task descriptions that are detailed and centered around the use case. For example, “Summarize the customer feedback for an investor report”.
Classify text with ai_classify¶

The ai_classify function takes a string or image and classifies it into the categories that you define.
The following example classifies travel reviews into categories such as “travel” and “cooking”. The function takes a column of text and a list of categories to classify the text into.
from snowflake.snowpark.functions import ai_classify, col
df = session.create_dataframe([
["I dream of backpacking across South America."],
["I made the best pasta yesterday."],
], schema=["sentence"])
df = df.select(
"sentence",
ai_classify(col("sentence"), ["travel", "cooking"]).alias("classification")
)
df.show()
Note
You can provide up to 500 categories. You can classify both text and images.
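Because ai_classify accepts at most 500 categories, it can be worth validating the category list client-side before the query runs. The following validate_categories helper is a hypothetical sketch based on the limit stated above, not part of the Snowpark API:

```python
MAX_CATEGORIES = 500  # documented limit for ai_classify

def validate_categories(categories: list[str]) -> list[str]:
    """Hypothetical pre-check: trim, deduplicate, and enforce the
    500-category limit before passing the list to ai_classify."""
    deduped = list(dict.fromkeys(c.strip() for c in categories))
    if not deduped:
        raise ValueError("at least one category is required")
    if len(deduped) > MAX_CATEGORIES:
        raise ValueError(f"too many categories: {len(deduped)} > {MAX_CATEGORIES}")
    return deduped

print(validate_categories(["travel", "cooking", "travel"]))  # ['travel', 'cooking']
```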
Filter rows with ai_filter¶

The ai_filter function evaluates a natural language condition and returns True or False. You can use it to filter or tag rows.
from snowflake.snowpark.functions import ai_filter, prompt, col
df = session.create_dataframe(["Canada", "Germany", "Japan"], schema=["country"])
filtered_df = df.select(
"country",
ai_filter(prompt("Is {0} in Asia?", col("country"))).alias("is_in_asia")
)
filtered_df.show()
Note
You can filter on both strings and files. For dynamic prompts, use the prompt function.
For more information, see the Snowpark Python reference.
Call Cortex AISQL functions in Snowflake ML¶
Snowflake ML contains the older AISQL functions, those with names that don’t begin with “AI”. These functions are supported in version 1.1.2 and later of Snowflake ML. The names are rendered in Pythonic “snake_case” format, with words separated by underscores and all letters in lowercase.
If you run your Python script outside of Snowflake, you must create a Snowpark session to use these functions. See Connecting to Snowflake for instructions.
Process single values¶
The following Python example illustrates calling Snowflake Cortex AI functions on single values:
from snowflake.cortex import complete, extract_answer, sentiment, summarize, translate
text = """
The Snowflake company was co-founded by Thierry Cruanes, Marcin Zukowski,
and Benoit Dageville in 2012 and is headquartered in Bozeman, Montana.
"""
print(complete("llama2-70b-chat", "how do snowflakes get their unique patterns?"))
print(extract_answer(text, "When was snowflake founded?"))
print(sentiment("I really enjoyed this restaurant. Fantastic service!"))
print(summarize(text))
print(translate(text, "en", "fr"))
Pass hyperparameter options¶
You can pass options that affect the model’s hyperparameters when using the complete
function. The following
Python example illustrates modifying the maximum number of output tokens that the model can generate:
from snowflake.cortex import complete, CompleteOptions
model_options1 = CompleteOptions(
{'max_tokens':30}
)
print(complete("llama3.1-8b", "how do snowflakes get their unique patterns?", options=model_options1))
Call functions on table columns¶
You can call an AI function on a table column, as shown below. This example requires a session object (stored in session) and a table articles containing a text column abstract_text. It creates a new column abstract_summary containing a summary of the abstract.
from snowflake.cortex import summarize
from snowflake.snowpark.functions import col
article_df = session.table("articles")
article_df = article_df.withColumn(
"abstract_summary",
summarize(col("abstract_text"))
)
article_df.collect()
Note
The advanced chat-style (multi-message) form of COMPLETE is not currently supported in Snowflake ML Python.
Using Snowflake Cortex AI functions with Snowflake CLI¶
Snowflake Cortex AISQL is available in Snowflake CLI version 2.4.0 and later. See Introducing Snowflake CLI for more information about using Snowflake CLI. The functions are the old-style AISQL functions, those with names that don’t begin with “AI”.
The following examples illustrate using the snow cortex
commands on single values. The -c
parameter specifies which connection to use.
Note
The advanced chat-style (multi-message) form of COMPLETE is not currently supported in Snowflake CLI.
snow cortex complete "Is 5 more than 4? Please answer using one word without a period." -c "snowhouse"
snow cortex extract-answer "what is snowflake?" "snowflake is a company" -c "snowhouse"
snow cortex sentiment "Mary had a little Lamb" -c "snowhouse"
snow cortex summarize "John has a car. John's car is blue. John's car is old and John is thinking about buying a new car. There are a lot of cars to choose from and John cannot sleep because it's an important decision for John."
snow cortex translate herb --to pl
You can also use files that contain the text you want to use for the commands. For this example, assume that the file about_cortex.txt contains the following content:
Snowflake Cortex gives you instant access to industry-leading large language models (LLMs) trained by researchers at companies like Anthropic, Mistral, Reka, Meta, and Google, including Snowflake Arctic, an open enterprise-grade model developed by Snowflake.
Since these LLMs are fully hosted and managed by Snowflake, using them requires no setup. Your data stays within Snowflake, giving you the performance, scalability, and governance you expect.
Snowflake Cortex features are provided as SQL functions and are also available in Python. The available functions are summarized below.
COMPLETE: Given a prompt, returns a response that completes the prompt. This function accepts either a single prompt or a conversation with multiple prompts and responses.
EMBED_TEXT_768: Given a piece of text, returns a vector embedding that represents that text.
EXTRACT_ANSWER: Given a question and unstructured data, returns the answer to the question if it can be found in the data.
SENTIMENT: Returns a sentiment score, from -1 to 1, representing the detected positive or negative sentiment of the given text.
SUMMARIZE: Returns a summary of the given text.
TRANSLATE: Translates given text from any supported language to any other.
You can then execute the snow cortex summarize command by passing in the filename using the --file parameter, as shown:
snow cortex summarize --file about_cortex.txt
Snowflake Cortex offers instant access to industry-leading language models, including Snowflake Arctic, with SQL functions for completing prompts (COMPLETE), text embedding (EMBED_TEXT_768), extracting answers (EXTRACT_ANSWER), sentiment analysis (SENTIMENT), summarizing text (SUMMARIZE), and translating text (TRANSLATE).
For more information about these commands, see snow cortex commands.
Legal notices¶
The data classification of inputs and outputs is as set forth in the following table.
| Input data classification | Output data classification | Designation |
|---|---|---|
| Usage Data | Customer Data | Generally available functions are Covered AI Features. Preview functions are Preview AI Features. [1] |
For additional information, refer to Snowflake AI and ML.