Snowflake Cortex AI Functions (including LLM functions)¶

Use Cortex AI Functions in Snowflake to run unstructured analytics on text and images with industry-leading LLMs from OpenAI, Anthropic, Meta, Mistral AI, and DeepSeek. AI Functions support use cases such as:

Extracting entities to enrich metadata and streamline validation
Aggregating insights across customer tickets
Filtering and classifying content by natural language
Sentiment and aspect-based analysis for service improvement
Translating and localizing multilingual content
Parsing documents for analytics and RAG pipelines

All models are fully hosted in Snowflake, ensuring performance, scalability, and governance while keeping your data secure and in place.

Available functions¶

Snowflake Cortex features are provided as SQL functions and are also available in Python. Cortex AI Functions can be grouped into the following categories:

Cortex AI functions
Helper functions

Cortex AI functions¶

These task-specific functions are purpose-built managed functions that automate routine tasks, like simple summaries and quick translations, that don’t require any customization.

Important

The following features are in preview and should not be used in production:

AI_AGG
AI_FILTER
AI_SUMMARIZE_AGG

AI_COMPLETE: Generates a completion for a given text string or image using a selected LLM. Use this function for most generative AI tasks.
- AI_COMPLETE is the updated version of COMPLETE (SNOWFLAKE.CORTEX).
AI_CLASSIFY: Classifies text or images into user-defined categories.
- AI_CLASSIFY is the updated version of CLASSIFY_TEXT (SNOWFLAKE.CORTEX) with support for multi-label and image classification.
AI_FILTER: Returns True or False for a given text or image input, allowing you to filter results in SELECT, WHERE, or JOIN ... ON clauses.
AI_AGG: Aggregates a text column and returns insights across multiple rows based on a user-defined prompt. This function isn’t subject to context window limitations.
AI_EMBED: Generates an embedding vector for a text or image input, which can be used for similarity search, clustering, and classification tasks.
- AI_EMBED is the updated version of EMBED_TEXT_1024 (SNOWFLAKE.CORTEX).
AI_EXTRACT: Extracts information from an input string or file, for example, text, images, and documents. Supports multiple languages.
- AI_EXTRACT is the updated version of EXTRACT_ANSWER (SNOWFLAKE.CORTEX).
AI_REDACT: Redacts personally identifiable information (PII) from text.
AI_SENTIMENT: Extracts sentiment from text.
- AI_SENTIMENT is the updated version of SENTIMENT (SNOWFLAKE.CORTEX).
AI_SUMMARIZE_AGG: Aggregates a text column and returns a summary across multiple rows. This function isn’t subject to context window limitations.
AI_SIMILARITY: Calculates the embedding similarity between two inputs.
AI_TRANSCRIBE: Transcribes audio and video files stored in a stage, extracting text, timestamps, and speaker information.
AI_PARSE_DOCUMENT: Extracts text (using OCR mode) or text with layout information (using LAYOUT mode) from documents in an internal or external stage.
- AI_PARSE_DOCUMENT is the updated version of PARSE_DOCUMENT (SNOWFLAKE.CORTEX).
AI_REDACT: Redact personally identifiable information (PII) from text.
AI_TRANSLATE: Translates text between supported languages.
- AI_TRANSLATE is the updated version of TRANSLATE (SNOWFLAKE.CORTEX).
SUMMARIZE (SNOWFLAKE.CORTEX): Returns a summary of the text that you’ve specified.

Helper functions¶

Helper functions are purpose-built managed functions that reduce cases of failures when running other Cortex AI Functions, for example by getting the count of tokens in an input prompt to ensure the call doesn’t exceed a model limit.

TO_FILE: Creates a reference to a file in an internal or external stage for use with AI_COMPLETE and other functions that accept files.
AI_COUNT_TOKENS: Given an input text, returns the token count based on the model or Cortex function specified.
- AI_COUNT_TOKENS is the updated version of COUNT_TOKENS (SNOWFLAKE.CORTEX).
PROMPT: Helps you build prompt objects for use with AI_COMPLETE and other functions.
TRY_COMPLETE (SNOWFLAKE.CORTEX): Works like the COMPLETE function, but returns NULL when the function could not execute instead of an error code.

Cortex Guard¶

Cortex Guard is an option of the AI_COMPLETE (or SNOWFLAKE.CORTEX.COMPLETE) function designed to filter possible unsafe and harmful responses from a language model. Cortex Guard is currently built with Meta’s Llama Guard 3. Cortex Guard works by evaluating the responses of a language model before that output is returned to the application. Once you activate Cortex Guard, language model responses which may be associated with violent crimes, hate, sexual content, self-harm, and more are automatically filtered. See COMPLETE arguments for syntax and examples.

Note

Usage of Cortex Guard incurs compute charges based on the number of input tokens processed, in addition to the charges for the AI_COMPLETE function.

Performance considerations¶

Cortex AI Functions are optimized for throughput. We recommend using these functions to process numerous inputs such as text from large SQL tables. Batch processing is typically better suited for AI Functions. For more interactive use cases where latency is important, use the REST API. These are available for simple inference (Complete API), embedding (Embed API) and agentic applications (Agents API).

Cortex LLM privileges¶

CORTEX_USER database role¶

The CORTEX_USER database role in the SNOWFLAKE database includes the privileges that allow users to call Snowflake Cortex AI Functions. By default, the CORTEX_USER role is granted to the PUBLIC role. The PUBLIC role is automatically granted to all users and roles, so this allows all users in your account to use the Snowflake Cortex AI functions.

If you don’t want all users to have this privilege, you can revoke access to the PUBLIC role and grant access to other roles. The SNOWFLAKE.CORTEX_USER database role cannot be granted directly to a user. For more information, see Using SNOWFLAKE database roles.

To revoke the CORTEX_USER database role from the PUBLIC role, run the following commands using the ACCOUNTADMIN role:

REVOKE DATABASE ROLE SNOWFLAKE.CORTEX_USER
  FROM ROLE PUBLIC;

REVOKE IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE
  FROM ROLE PUBLIC;

Copy

You can then selectively provide access to specific roles. A user with the ACCOUNTADMIN role can grant this role to a custom role in order to allow users to access Cortex AI functions. In the following example, use the ACCOUNTADMIN role and grant the user some_user the CORTEX_USER database role via the account role cortex_user_role, which you create for this purpose.

USE ROLE ACCOUNTADMIN;

CREATE ROLE cortex_user_role;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE cortex_user_role;

GRANT ROLE cortex_user_role TO USER some_user;

Copy

You can also grant access to Snowflake Cortex AI functions through existing roles commonly used by specific groups of users. (See User roles.) For example, if you have created an analyst role that is used as a default role by analysts in your organization, you can easily grant these users access to Snowflake Cortex AI Functions with a single GRANT statement.

GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE analyst;

Copy

CORTEX_EMBED_USER database role¶

The CORTEX_EMBED_USER database role in the SNOWFLAKE database includes the privileges that allow users to call the text embedding functions AI_EMBED, EMBED_TEXT_768, and EMBED_TEXT_1024 and to create Cortex Search Services with managed vector embeddings. CORTEX_EMBED_USER allows you to grant embedding privileges separately from other Cortex AI capabilities.

Note

You can create Cortex Search Services with user-provided embeddings without the CORTEX_EMBED_USER role. In that case, you must generate the embeddings yourself, outside of Snowflake, and load them into a table.

Unlike the CORTEX_USER role, the CORTEX_EMBED_USER role is not granted to the PUBLIC role by default. You must explicitly grant this role to roles that require embedding capabilities if you have revoked the CORTEx_USER role. The CORTEX_EMBED_USER database role cannot be granted directly to users but must be granted to roles that users can assume. The following example illustrates this process.

USE ROLE ACCOUNTADMIN;

CREATE ROLE cortex_embed_user_role;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_EMBED_USER TO ROLE cortex_embed_user_role;

GRANT ROLE cortex_embed_user_role TO USER some_user;

Copy

Alternatively, to give all users access to embedding capabilities, grant the CORTEX_EMBED_USER role to the PUBLIC role as follows.

USE ROLE ACCOUNTADMIN;

GRANT DATABASE ROLE SNOWFLAKE.CORTEX_EMBED_USER TO ROLE PUBLIC;

Copy

Using AI Functions in stored procedures with EXECUTE AS RESTRICTED CALLER¶

To use AI Functions inside stored procedures with EXECUTE AS RESTRICTED CALLER, grant the following privileges to the role that created the stored procedure:

GRANT INHERITED CALLER USAGE ON ALL SCHEMAS IN DATABASE snowflake TO ROLE <role_that_created_the_stored_procedure>;
GRANT INHERITED CALLER USAGE ON ALL FUNCTIONS IN DATABASE snowflake TO ROLE <role_that_created_the_stored_procedure>;
GRANT CALLER USAGE ON DATABASE snowflake TO ROLE <role_that_created_the_stored_procedure>;

Copy

Control model access¶

Snowflake Cortex provides two independent mechanisms to enforce access to models:

Account-level allowlist parameter (simple, broad control)
Role-based access control (RBAC) (fine-grained control)

You can use the account-level allowlist to control model access across your entire account, or you can use RBAC to control model access on a per-role basis. For maximum flexibility, you can also use both mechanisms together, if you can accept additional management complexity.

Account-level allowlist parameter¶

You can control model access across your entire account using the CORTEX_MODELS_ALLOWLIST parameter. Supported features will respect the value of this parameter and prevent use of models that are not in the allowlist.

The CORTEX_MODELS_ALLOWLIST parameter can be set to 'All', 'None', or to a comma-separated list of model names. This parameter can only be set at the account level, not at the user or session levels. Only the ACCOUNTADMIN role can set the parameter using the ALTER ACCOUNT command.

Examples:

To allow access to all models:

ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'All';

Copy

To allow access to the mistral-large2 and llama3.1-70b models:

ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'mistral-large2,llama3.1-70b';

Copy

To prevent access to any model:

ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'None';

Copy

Use RBAC, as described in the following section, to provide specific roles with access beyond what you’ve specified in the allowlist.

Role-based access control (RBAC)¶

Although Cortex models are not themselves Snowflake objects, Snowflake lets you create model objects in the SNOWFLAKE.MODELS schema that represent the Cortex models. By applying RBAC to these objects, you can control access to models the same way you would any other Snowflake object. Supported features accept the identifiers of objects in SNOWFLAKE.MODELS wherever a model can be specified.

Tip

To use RBAC exclusively, set CORTEX_MODELS_ALLOWLIST to 'None'.

Refresh model objects and application roles¶

SNOWFLAKE.MODELS is not automatically populated with the objects that represent Cortex models. You must create these objects when you first set up model RBAC, and refresh them when you want to apply RBAC to new models.

As ACCOUNTADMIN, run the SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH stored procedure to populate the SNOWFLAKE.MODELS schema with objects representing currently available Cortex models, and to create application roles that correspond to the models. The procedure also creates CORTEX-MODEL-ROLE-ALL, a role that covers all models.

Tip

You can safely call CORTEX_BASE_MODELS_REFRESH at any time; it will not create duplicate objects or roles.

CALL SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH();

Copy

After refreshing the model objects, you can verify that the models appear in the SNOWFLAKE.MODELS schema as follows:

SHOW MODELS IN SNOWFLAKE.MODELS;

Copy

The returned list of models resembles the following:

created_on	name	model_type	database_name	schema_name	owner
2025-04-22 09:35:38.558 -0700	CLAUDE-3-5-SONNET	CORTEX_BASE	SNOWFLAKE	MODELS	SNOWFLAKE
2025-04-22 09:36:16.793 -0700	LLAMA3.1-405B	CORTEX_BASE	SNOWFLAKE	MODELS	SNOWFLAKE
2025-04-22 09:37:18.692 -0700	SNOWFLAKE-ARCTIC	CORTEX_BASE	SNOWFLAKE	MODELS	SNOWFLAKE

To verify that you can see the application roles associated with these models, use the SHOW APPLICATION ROLES command, as in the following example:

SHOW APPLICATION ROLES IN APPLICATION SNOWFLAKE;

Copy

The list of application roles resembles the following:

created_on	name	owner	comment	owner_role_type
2025-04-22 09:35:38.558 -0700	CORTEX-MODEL-ROLE-ALL	SNOWFLAKE	MODELS	APPLICATION
2025-04-22 09:36:16.793 -0700	CORTEX-MODEL-ROLE-LLAMA3.1-405B	SNOWFLAKE	MODELS	APPLICATION
2025-04-22 09:37:18.692 -0700	CORTEX-MODEL-ROLE-SNOWFLAKE-ARCTIC	SNOWFLAKE	MODELS	APPLICATION

Grant application roles to user roles¶

After you create the model objects and application roles, you can grant the application roles to specific user roles in your account.

To grant a role access to a specific model:

GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-LLAMA3.1-70B" TO ROLE MY_ROLE;

Copy

To grant a role access to all current and future models:

GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-ALL" TO ROLE MY_ROLE;

Copy

Use model objects with supported features¶

To use model objects with supported Cortex features, specify the identifier of the model object in SNOWFLAKE.MODELS as the model argument. You can use a fully-qualified identifier, a partial identifier, or a simple model name that will be automatically resolved to SNOWFLAKE.MODELS.

Using a fully-qualified identifier:

SELECT AI_COMPLETE('SNOWFLAKE.MODELS."LLAMA3.1-70B"', 'Hello');

Copy

Using a partial identifier:

USE DATABASE SNOWFLAKE;
USE SCHEMA MODELS;
SELECT AI_COMPLETE('LLAMA3.1-70B', 'Hello');

Copy

Using automatic lookup with a simple model name:

-- Automatically resolves to SNOWFLAKE.MODELS."LLAMA3.1-70B"
SELECT AI_COMPLETE('llama3.1-70b', 'Hello');

Copy

Using RBAC with account-level allowlist¶

A number of Cortex features accept a model name as a string argument, for example AI_COMPLETE('model', 'prompt'). When you provide a model name:

Cortex first attempts to locate a matching model object in SNOWFLAKE.MODELS. If you provide an unqualified name like 'x', it automatically looks for SNOWFLAKE.MODELS."X".
If the model object is found, RBAC is applied to determine whether the user can use the model.
If no model object is found, the provided string is matched against the account-level allowlist.

The following example illustrates the use of allowlist and RBAC together. In this example, the allowlist is set to allow the mistral-large2 model, and the user has access to the LLAMA3.1-70B model object through RBAC.

-- set up access
USE SECONDARY ROLES NONE;
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'MISTRAL-LARGE2';
CALL SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH();
GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-LLAMA3.1-70B" TO ROLE PUBLIC;

-- test access
USE ROLE PUBLIC;

-- this succeeds because mistral-large2 is in the allowlist
SELECT AI_COMPLETE('MISTRAL-LARGE2', 'Hello');

-- this succeeds because the role has access to the model object
SELECT AI_COMPLETE('SNOWFLAKE.MODELS."LLAMA3.1-70B"', 'Hello');

-- this fails because the first argument is
-- neither an identifier for an accessible model object
-- nor is it a model name in the allowlist
SELECT AI_COMPLETE('SNOWFLAKE-ARCTIC', 'Hello');

Copy

Common pitfalls¶

Access to a model (whether by allowlist or RBAC) does not always mean that it can be used. It may still be subject to cross-region, deprecation, or other availability constraints. These restrictions can result in error messages that seem similar to model access errors.
Model access controls only govern use of a model, and not the use of a feature itself, which may have its own access controls. For example, access to AI_COMPLETE is governed by the CORTEX_USER database role. See Cortex LLM privileges for more information.
Not all features support model access controls. See the supported features table to see which access control methods a given feature supports.
Secondary roles can obscure permissions. For example, if a user has ACCOUNTADMIN as a secondary role, all model objects may appear accessible. Disable secondary roles temporarily when verifying permissions.
Qualified model object identifiers are quoted and therefore case-sensitive. See QUOTED_IDENTIFIERS_IGNORE_CASE for more information.

Supported features¶

Model access controls are supported by the following features:

Feature	Account-level allowlist	Role-based access control	Notes
AI_COMPLETE	✔	✔
AI_CLASSIFY	✔	✔	If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
AI_FILTER	✔	✔	If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
AI_AGG	✔	✔	If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
AI_SUMMARIZE_AGG	✔	✔	If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
COMPLETE (SNOWFLAKE.CORTEX)	✔	✔
TRY_COMPLETE (SNOWFLAKE.CORTEX)	✔	✔
Cortex REST API	✔	✔
Cortex Playground	✔	✔

Regional availability¶

Snowflake Cortex AI functions are available in the following regions. If your region is not listed for a particular function, use cross-region inference.

Note

The TRY_COMPLETE function is available in the same regions as COMPLETE.
The AI_COUNT_TOKENS function is available in all regions for any model, but the models themselves are available only in the regions specified in the tables below.

The following functions and models are available in any region via cross-region inference.

Function Model	Cross Cloud (Any Region)	AWS US (Cross-Region)	AWS US Commercial Gov (Cross-Region)	AWS EU (Cross-Region)	AWS APJ (Cross-Region)	Azure US (Cross-Region)
AI_COMPLETE
`claude-sonnet-4-5`	*	*	*	*
`claude-haiku-4-5`	*		*
`claude-4-sonnet`	✔	✔	✔	✔	✔
`claude-3-7-sonnet`	✔	✔	✔	✔
`claude-3-5-sonnet`	✔	✔
`llama4-maverick`	✔	✔
`llama4-scout`	✔	✔
`llama3.1-8b`	✔	✔	✔	✔	✔	✔
`llama3.1-70b`	✔	✔	✔	✔	✔	✔
`llama3.3-70b`	✔	✔
`snowflake-llama-3.3-70b`	✔	✔
`llama3.1-405b`	✔	✔	✔			✔
`openai-gpt-4.1`	✔					✔
`openai-gpt-5`	*					*
`openai-gpt-5-mini`	*					*
`openai-gpt-5-nano`	*					*
`openai-gpt-5-chat`	✔
`openai-gpt-oss-120b`	*
`openai-gpt-oss-20b`	*
`snowflake-llama-3.1-405b`	✔	✔	✔
`snowflake-arctic`	✔	✔				✔
`deepseek-r1`	✔	✔
`mistral-large2`	✔	✔	✔		✔	✔
`mixtral-8x7b`	✔	✔	✔	✔	✔	✔
`mistral-7b`	✔	✔	✔	✔	✔	✔

EMBED_TEXT_768
`e5-base-v2`	✔	✔	✔	✔	✔	✔
`snowflake-arctic-embed-m`	✔	✔	✔	✔	✔	✔
`snowflake-arctic-embed-m-v1.5`	✔	✔	✔	✔	✔	✔

EMBED_TEXT_1024
`snowflake-arctic-embed-l-v2.0`	✔	✔	✔	✔	✔	✔
`snowflake-arctic-embed-l-v2.0-8k`	✔	✔	✔	✔	✔	✔
`nv-embed-qa-4`	✔	✔
`multilingual-e5-large`	✔	✔	✔	✔	✔	✔
`voyage-multilingual-2`	✔	✔	✔	✔	✔	✔

AI_CLASSIFY TEXT	✔	✔		✔	✔	✔
AI_CLASSIFY IMAGE	✔
AI_EXTRACT	✔	✔		✔	✔	✔
AI_FILTER TEXT *	✔	✔		✔	✔	✔
AI_FILTER IMAGE *	✔
AI_AGG *	✔	✔		✔	✔	✔
AI_REDACT	✔	✔	✔	✔	✔	✔
AI_SENTIMENT	✔	✔		✔	✔	✔
AI_SIMILARITY TEXT	✔	✔		✔	✔	✔
AI_SIMILARITY IMAGE	✔	✔		✔
AI_SUMMARIZE_AGG *	✔	✔		✔	✔	✔
AI_TRANSCRIBE	✔	✔		✔		✔
SENTIMENT	✔	✔	✔	✔	✔	✔
ENTITY_SENTIMENT	✔	✔	✔	✔	✔	✔
EXTRACT_ANSWER	✔	✔	✔	✔	✔	✔
SUMMARIZE	✔	✔	✔	✔	✔	✔
TRANSLATE	✔	✔	✔	✔	✔	✔

The following functions and models are available natively in North American regions.

Function Model	AWS US West 2 (Oregon)	AWS US East 1 (N. Virginia)	AWS US East (Commercial Gov - N. Virginia)	Azure East US 2 (Virginia)	Azure West US (Washington)	Azure South Central US (Texas)
AI_COMPLETE
`claude-4-sonnet`
`claude-3-7-sonnet`
`claude-3-5-sonnet`	✔	✔
`llama4-maverick`	✔
`llama4-scout`	✔
`llama3.1-8b`	✔	✔	✔	✔
`llama3.1-70b`	✔	✔	✔	✔
`llama3.3-70b`	✔
`snowflake-llama-3.3-70b`	✔
`llama3.1-405b`	✔	✔	✔	✔
`openai-gpt-4.1`				✔
`openai-gpt-oss-120b`	*
`openai-gpt-oss-20b`	*			*
`snowflake-llama-3.1-405b`	✔
`snowflake-arctic`	✔			✔
`deepseek-r1`	✔
`mistral-large2`	✔	✔	✔	✔
`mixtral-8x7b`	✔	✔	✔	✔
`mistral-7b`	✔	✔	✔	✔

EMBED_TEXT_768
`e5-base-v2`	✔	✔	✔	✔
`snowflake-arctic-embed-m`	✔	✔	✔	✔
`snowflake-arctic-embed-m-v1.5`	✔	✔	✔	✔

EMBED_TEXT_1024
`snowflake-arctic-embed-l-v2.0`	✔	✔	✔	✔
`snowflake-arctic-embed-l-v2.0-8k`	✔	✔	✔	✔
`nv-embed-qa-4`	✔
`multilingual-e5-large`	✔	✔	✔	✔
`voyage-multilingual-2`	✔	✔	✔	✔

AI_CLASSIFY TEXT	✔	✔		✔
AI_CLASSIFY IMAGE	✔	✔
AI_EXTRACT	✔	✔		✔	✔	✔
AI_FILTER TEXT *	✔	✔		✔
AI_FILTER IMAGE *	✔	✔
AI_AGG *	✔	✔		✔
AI_REDACT	✔	✔	✔	✔
AI_SIMILARITY TEXT	✔	✔		✔
AI_SIMILARITY IMAGE	✔	✔
AI_SUMMARIZE_AGG *	✔	✔		✔
AI_TRANSCRIBE	✔	✔		✔
SENTIMENT	✔	✔	✔	✔
ENTITY_SENTIMENT	✔	✔	✔	✔
EXTRACT_ANSWER	✔	✔	✔	✔
SUMMARIZE	✔	✔	✔	✔
TRANSLATE	✔	✔	✔	✔

The following functions and models are available natively in European regions.

Function Model	AWS Europe Central 1 (Frankfurt)	AWS Europe West 1 (Ireland)	Azure West Europe (Netherlands)
AI_COMPLETE
`claude-4-sonnet`
`claude-3-7-sonnet`
`claude-3-5-sonnet`
`llama4-maverick`
`llama4-scout`
`llama3.1-8b`	✔	✔	✔
`llama3.1-70b`	✔	✔	✔
`llama3.3-70b`
`snowflake-llama-3.3-70b`
`llama3.1-405b`
`openai-gpt-4.1`
`openai-gpt-oss-120b`
`openai-gpt-oss-20b`
`snowflake-llama-3.1-405b`
`snowflake-arctic`
`deepseek-r1`
`mistral-large2`	✔	✔	✔
`mixtral-8x7b`	✔	✔	✔
`mistral-7b`	✔	✔	✔

EMBED_TEXT_768
`e5-base-v2`	✔		✔
`snowflake-arctic-embed-m`	✔	✔	✔
`snowflake-arctic-embed-m-v1.5`	✔	✔	✔

EMBED_TEXT_1024
`snowflake-arctic-embed-l-v2.0`	✔	✔	✔
`snowflake-arctic-embed-l-v2.0-8k`	✔	✔	✔
`nv-embed-qa-4`
`multilingual-e5-large`	✔	✔	✔
`voyage-multilingual-2`	✔	✔	✔

AI_CLASSIFY TEXT	✔	✔	✔
AI_CLASSIFY IMAGE	✔
AI_EXTRACT	✔	✔	✔
AI_FILTER TEXT *	✔	✔	✔
AI_FILTER IMAGE *	✔
AI_AGG *	✔	✔	✔
AI_REDACT	✔	✔	✔
AI_SIMILARITY TEXT	✔	✔	✔
AI_SIMILARITY IMAGE	✔
AI_SUMMARIZE_AGG *	✔	✔	✔
AI_TRANSCRIBE	✔
SENTIMENT	✔	✔	✔
ENTITY_SENTIMENT	✔		✔
EXTRACT_ANSWER	✔	✔	✔
SUMMARIZE	✔	✔	✔
TRANSLATE	✔	✔	✔

The following functions and models are available natively in Asia-Pacific regions:

Function \| Model	AWS AP Southeast 2 (Sydney)	AWS AP Northeast 1 (Tokyo)
AI_COMPLETE
`claude-4-sonnet`
`claude-3-7-sonnet`
`claude-3-5-sonnet`	✔
`llama4-maverick`
`llama4-scout`
`llama3.1-8b`	✔	✔
`llama3.1-70b`	✔	✔
`llama3.3-70b`
`snowflake-llama-3.3-70b`
`llama3.1-405b`
`openai-gpt-4.1`
`snowflake-llama-3.1-405b`
`snowflake-arctic`
`deepseek-r1`
`mistral-large2`	✔	✔
`mixtral-8x7b`	✔	✔
`mistral-7b`	✔	✔

EMBED_TEXT_768
`e5-base-v2`	✔	✔
`snowflake-arctic-embed-m`	✔	✔
`snowflake-arctic-embed-m-v1.5`	✔	✔

EMBED_TEXT_1024
`snowflake-arctic-embed-l-v2.0`	✔	✔
`snowflake-arctic-embed-l-v2.0-8k`	✔	✔
`nv-embed-qa-4`
`multilingual-e5-large`	✔	✔
`voyage-multilingual-2`	✔	✔

AI_EXTRACT	✔	✔
AI_CLASSIFY TEXT	✔	✔
AI_CLASSIFY IMAGE
AI_FILTER TEXT *	✔	✔
AI_FILTER IMAGE *
AI_AGG *	✔	✔
AI_SIMILARITY TEXT	✔	✔
AI_SIMILARITY IMAGE
AI_SUMMARIZE_AGG *	✔	✔
AI_TRANSCRIBE
EXTRACT_ANSWER	✔	✔
SENTIMENT	✔	✔
ENTITY_SENTIMENT		✔
SUMMARIZE	✔	✔
TRANSLATE	✔	✔

* Indicates a preview function or model. Preview features are not suitable for production workloads.

The following Snowflake Cortex AI functions and models are available in the following extended regions.

Function Model	AWS US East 2 (Ohio)	AWS CA Central 1 (Central)	AWS SA East 1 (São Paulo)	AWS Europe West 2 (London)	AWS Europe Central 1 (Frankfurt)	AWS Europe North 1 (Stockholm)	AWS AP Northeast 1 (Tokyo)	AWS AP South 1 (Mumbai)	AWS AP Southeast 2 (Sydney)	AWS AP Southeast 3 (Jakarta)	Azure South Central US (Texas)	Azure West US 2 (Washington)	Azure UK South (London)	Azure North Europe (Ireland)	Azure Switzerland North (Zürich)	Azure Central India (Pune)	Azure Japan East (Tokyo, Saitama)	Azure Southeast Asia (Singapore)	Azure Australia East (New South Wales)	Google Cloud Europe West 2 (London)	Google Cloud Europe West 4 (Netherlands)	Google Cloud US Central 1 (Iowa)	Google Cloud US East 4 (N. Virginia)
EMBED_TEXT_768
`snowflake-arctic-embed-m-v1.5`	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔
`snowflake-arctic-embed-m`	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔
EMBED_TEXT_1024
`multilingual-e5-large`	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔	✔
AI_EXTRACT	✔	✔	✔	✔	✔	Cross-region only	✔	Cross-region only	✔	Cross-region only	✔	✔	Cross-region only	✔	Cross-region only	✔	✔	✔	✔	Cross-region only	Cross-region only	Cross-region only	Cross-region only

The following table lists availability of legacy models. These models have not been deprecated and can still be used. However, Snowflake recommends newer models for new development.

Legacy¶
Function (Model)	AWS US West 2 (Oregon)	AWS US East 1 (N. Virginia)	AWS Europe Central 1 (Frankfurt)	AWS AP Southeast 2 (Sydney)	AWS AP Northeast 1 (Tokyo)	Azure East US 2 (Virginia)	Azure West Europe (Netherlands)
AI_COMPLETE
`llama3-8b`	✔	✔	✔	✔	✔	✔
`llama3-70b`	✔	✔	✔		✔	✔
`mistral-large`	✔	✔	✔			✔	✔
`openai-o4-mini`						✔

Create stage for media files¶

Cortex AI Functions that process media files (documents, images, audio, or video) require the files to be stored on an internal or external stage. The stage must use server-side encryption. If you want to be able to query the stage or programmatically process all the files stored there, the stage must have a directory table.

The SQL below creates a suitable internal stage:

CREATE OR REPLACE STAGE input_stage
  DIRECTORY = ( ENABLE = true )
  ENCRYPTION = ( TYPE = 'SNOWFLAKE_SSE' );

Copy

To process files from external object storage (e.g., Amazon S3), create a storage integration, then create an external stage that uses the storage integration. To learn how to configure a Snowflake Storage Integration, see our detailed guides:

Create an external stage that references the integration and points to your cloud storage container. This example points to an Amazon S3 bucket:

CREATE OR REPLACE STAGE my_aisql_media_files
  STORAGE_INTEGRATION = my_s3_integration
  URL = 's3://my_bucket/prefix/'
  DIRECTORY = ( ENABLE = TRUE )
  ENCRYPTION = ( TYPE = 'AWS_SSE_S3' );

Copy

With an internal or external stage created, and files stored there, you can use Cortex AI Functions to process media files stored in the stage. For more information, see:

Note

AI Functions are currently incompatible with custom network policies.

Cortex AI Functions storage best practices¶

You may find the following best practices helpful when working with media files in stages with Cortex AI Functions:

Establish a scheme for organizing media files in stages. For example, create a separate stage for each team or project, and store the different types of media files in subdirectories.
Enable directory listings on stages to allow querying and programmatic access to its files.

Tip

To automatically refresh the directory table for the external stage when new or updated files are available, set AUTO_REFRESH = TRUE when creating the stage.
For external stages, use fine-grained policies on the cloud provider side (for example, AWS IAM policies) to restrict the storage integration’s access to only what is necessary.
Always use encryption, such as AWS_SSE or SNOWFLAKE_SSE, to protect your data at rest.

Cost considerations¶

Snowflake Cortex AI functions incur compute cost based on the number of tokens processed. Refer to the Snowflake Service Consumption Table for each function’s cost in credits per million tokens.

A token is the smallest unit of text processed by Snowflake Cortex AI functions. An industry convention for text is that a token is approximately equal to four characters, although this can vary by model, as can token equivalence for media files.

For functions that generate new text using provided text (AI_COMPLETE, AI_CLASSIFY, AI_FILTER, AI_AGG, AI_SUMMARIZE, and AI_TRANSLATE, and their previous versions in the SNOWFLAKE.CORTEX schema), both input and output tokens are billable.
For Cortex Guard, only input tokens are counted. The number of input tokens is based on the number of tokens output from AI_COMPLETE (or COMPLETE). Cortex Guard usage is billed in addition to the cost of the AI_COMPLETE (or COMPLETE) function.
For AI_SIMILARITY, AI_EMBED, and the SNOWFLAKE.CORTEX.EMBED_* functions, only input tokens are counted.
For EXTRACT_ANSWER, the number of billable tokens is the sum of the number of tokens in the from_text and question fields.
AI_CLASSIFY, AI_FILTER, AI_AGG, AI_SENTIMENT, AI_SUMMARIZE_AGG, SUMMARIZE, TRANSLATE, AI_TRANSLATE, EXTRACT_ANSWER, ENTITY_SENTIMENT, and SENTIMENT add a prompt to the input text in order to generate the response. As a result, the billed token count is higher than the number of tokens in the text you provide.
AI_CLASSIFY labels, descriptions, and examples are counted as input tokens for each record processed, not just once for each AI_CLASSIFY call.
For AI_PARSE_DOCUMENT (or SNOWFLAKE.CORTEX.PARSE_DOCUMENT), billing is based on the number of document pages processed.
TRY_COMPLETE (SNOWFLAKE.CORTEX) does not incur costs for error handling. If the TRY_COMPLETE(SNOWFLAKE.CORTEX) function returns NULL, no cost is incurred.
For AI_EXTRACT, both input and output tokens are counted. The responseFormat argument is counted as input tokens. For document formats consisting of pages, the number of pages processed is counted as input tokens. Each page in a document is counted as 970 tokens.
AI_COUNT_TOKENS incurs only compute cost to run the function. No additional token-based costs are incurred.

For models that support media files such as images or audio:

Audio files are billed at 50 tokens per second of audio.
The token equivalence of images is determined by the model used. For more information, see AI Image cost considerations.

Snowflake recommends executing queries that call a Snowflake Cortex AI Function with a smaller warehouse (no larger than MEDIUM). Larger warehouses do not increase performance. The cost associated with keeping a warehouse active continues to apply when executing a query that calls a Snowflake Cortex LLM Function. For general information on compute costs, see Understanding compute cost.

Warehouse sizing¶

Snowflake recommends using a warehouse size no larger than MEDIUM when calling Snowflake Cortex AI Functions. Using a larger warehouse than necessary does not increase performance, but can result in unnecessary costs. This recommendation may change in the future as we continue to evolve Cortex AI Functions.

Track costs for AI services¶

To track credits used for AI Services including LLM Functions in your account, use the METERING_HISTORY view:

SELECT *
  FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_DAILY_HISTORY
  WHERE SERVICE_TYPE='AI_SERVICES';

Copy

Track credit consumption for Cortex AI Functions¶

To view the credit and token consumption for each AI Function call, use the CORTEX_FUNCTIONS_USAGE_HISTORY view:

SELECT *
  FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_USAGE_HISTORY;

Copy

You can also view the credit and token consumption for each query within your Snowflake account. Viewing the credit and token consumption for each query helps you identify queries that are consuming the most credits and tokens.

The following example query uses the CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY view to show the credit and token consumption for all of your queries within your account.

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY;

Copy

You can also use the same view to see the credit and token consumption for a specific query.

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY
WHERE query_id='<query-id>';

Copy

Note

You can’t get granular usage information for requests made with the REST API.

The query usage history is grouped by the models used in the query. For example, if you ran:

SELECT AI_COMPLETE('mistral-7b', 'Is a hot dog a sandwich'), AI_COMPLETE('mistral-large', 'Is a hot dog a sandwich');

Copy

The query usage history would show two rows, one for mistral-7b and one for mistral-large.

Usage quotas¶

On-demand Snowflake accounts without a valid payment method (such as trial accounts) are limited to 10 credits per day for Snowflake Cortex AI Functions usage. To remove this limit, convert your trial account to a paid account.

Model restrictions¶

Models used by Snowflake Cortex have limitations on size as described in the table below. Sizes are given in tokens. Tokens generally represent about four characters of text, so the number of words corresponding to a limit is less than the number of tokens. Inputs that exceed the limit result in an error.

The maximum size of the output that a model can produce is limited by the following:

The model’s output token limit.
The space available in the context window after the model consumes the input tokens.

For example, claude-3-5-sonnet has a context window of 200,000 tokens. If 100,000 tokens are used for the input, the model can generate up to 8,192 tokens. However, if 195,000 tokens are used as input, then the model can only generate up to 5,000 tokens for a total of 200,000 tokens.

Important

In the AWS AP Southeast 2 (Sydney) region:

the context window for llama3-8b and mistral-7b is 4,096 tokens.
the context window for llama3.1-8b is 16,384 tokens.
the context window for the Snowflake managed model from the SUMMARIZE function is 4,096 tokens.

In the AWS Europe West 1 (Ireland) region:

the context window for llama3.1-8b is 16,384 tokens.
the context window for mistral-7b is 4,096 tokens.

Function	Model	Context window (tokens)	Max output (tokens)
COMPLETE	`llama4-maverick`	128,000	8,192
	`llama4-scout`	128,000	8,192
	`snowflake-arctic`	4,096	8,192
	`deepseek-r1`	32,768	8,192
	`claude-sonnet-4-5`	200,000	64,000
	`claude-haiku-4-5`	200,000	64,000
	`claude-4-sonnet`	200,000	32,000
	`claude-3-7-sonnet`	200,000	32,000
	`claude-3-5-sonnet`	200,000	8,192
	`mistral-large`	32,000	8,192
	`mistral-large2`	128,000	8,192
	`openai-gpt-4.1`	128,000	32,000
	`openai-o4-mini`	200,000	32,000
	`openai-gpt-5`	272,000	8,192
	`openai-gpt-5-mini`	272,000	8,192
	`openai-gpt-5-nano`	272,000	8,192
	`openai-gpt-5-chat`	128,000	8,192
	`openai-gpt-oss-120b`	128,000	8,192
	`openai-gpt-oss-20b`	128,000	8,192
	`mixtral-8x7b`	32,000	8,192
	`llama3-8b`	8,000	8,192
	`llama3-70b`	8,000	8,192
	`llama3.1-8b`	128,000	8,192
	`llama3.1-70b`	128,000	8,192
	`llama3.3-70b`	128,000	8,192
	`snowflake-llama-3.3-70b`	128,000	8,192
	`llama3.1-405b`	128,000	8,192
	`snowflake-llama-3.1-405b`	8,000	8,192
	`mistral-7b`	32,000	8,192
EMBED_TEXT_768	`e5-base-v2`	512	n/a
	`snowflake-arctic-embed-m`	512	n/a
EMBED_TEXT_1024	`nv-embed-qa-4`	512	n/a
	`multilingual-e5-large`	512	n/a
	`voyage-multilingual-2`	32,000	n/a
AI_EXTRACT	`arctic-extract`	128,000	51,200
AI_FILTER	Snowflake managed model	128,000	n/a
AI_CLASSIFY	Snowflake managed model	128,000	n/a
AI_AGG	Snowflake managed model	128,000 per row can be used across multiple rows	8,192
AI_SENTIMENT	Snowflake managed model	2,048	n/a
AI_SUMMARIZE_AGG	Snowflake managed model	128,000 per row can be used across multiple rows	8,192
ENTITY_SENTIMENT	Snowflake managed model	2,048	n/a
EXTRACT_ANSWER	Snowflake managed model	2,048 for text 64 for question	n/a
SENTIMENT	Snowflake managed model	512	n/a
SUMMARIZE	Snowflake managed model	32,000	4,096
TRANSLATE	Snowflake managed model	4,096	n/a

Choosing a model¶

The Snowflake Cortex AI_COMPLETE function supports multiple models of varying capability, latency, and cost. These models have been carefully chosen to align with common customer use cases. To achieve the best performance per credit, choose a model that’s a good match for the content size and complexity of your task. Here are brief overviews of the available models.

Large models¶

If you’re not sure where to start, try the most capable models first to establish a baseline to evaluate other models. claude-3-7-sonnet and mistral-large2 are the most capable models offered by Snowflake Cortex, and will give you a good idea what a state-of-the-art model can do.

Claude 3-7 Sonnet is a leader in general reasoning and multimodal capabilities. It outperforms its predecessors in tasks that require reasoning across different domains and modalities. You can use its large output capacity to get more information from either structured or unstructured queries. Its reasoning capabilities and large context windows make it well-suited for agentic workflows.
deepseek-r1 is a foundation model trained using large-scale reinforcement-learning (RL) without supervised fine-tuning (SFT). It can deliver high performance across math, code, and reasoning tasks. To access the model, set the cross-region inference parameter to AWS_US.
mistral-large2 is Mistral AI’s most advanced large language model with top-tier reasoning capabilities. Compared to mistral-large, it’s significantly more capable in code generation, mathematics, reasoning, and provides much stronger multilingual support. It’s ideal for complex tasks that require large reasoning capabilities or are highly specialized, such as synthetic text generation, code generation, and multilingual text analytics.
llama3.1-405b is an open source model from the llama3.1 model family from Meta with a large 128K context window. It excels in long document processing, multilingual support, synthetic data generation and model distillation.
snowflake-llama3.1-405b is a model derived from the open source llama3.1 model. It uses the SwiftKV optimizations developed by the Snowflake AI research team to deliver up to a 75% inference cost reduction. SwiftKV achieves higher throughput performance with minimal accuracy loss.

Medium models¶

llama3.1-70b is an open source model that demonstrates state-of-the-art performance ideal for chat applications, content creation, and enterprise applications. It is a highly performant, cost effective model that enables diverse use cases with a context window of 128K. llama3-70b is still supported and has a context window of 8K.
snowflake-llama3.3-70b is a model derived from the open source llama3.3 model. It uses the SwiftKV optimizations developed by the Snowflake AI research team to deliver up to a 75% inference cost reduction. SwiftKV achieves higher throughput performance with minimal accuracy loss.
snowflake-arctic is Snowflake’s top-tier enterprise-focused LLM. Arctic excels at enterprise tasks such as SQL generation, coding and instruction following benchmarks.
mixtral-8x7b is ideal for text generation, classification, and question answering. Mistral models are optimized for low latency with low memory requirements, which translates into higher throughput for enterprise use cases.

Small models¶

llama3.1-8b is ideal for tasks that require low to moderate reasoning. It’s a light-weight, ultra-fast model with a context window of 128K. llama3-8b provides a smaller context window and relatively lower accuracy.
mistral-7b is ideal for your simplest summarization, structuration, and question answering tasks that need to be done quickly. It offers low latency and high throughput processing for multiple pages of text with its 32K context window.

The following table provides information on how popular models perform on various benchmarks, including the models offered by Snowflake Cortex AI_COMPLETE as well as a few other popular models.

Model	Context Window (Tokens)	MMLU (Reasoning)	HumanEval (Coding)	GSM8K (Arithmetic Reasoning)	Spider 1.0 (SQL)
GPT 4.o	128,000	88.7	90.2	96.4	-
Claude 3.5 Sonnet	200,000	88.3	92.0	96.4	-
llama3.1-405b	128,000	88.6	89	96.8	-
llama3.1-70b	128,000	86	80.5	95.1	-
mistral-large2	128,000	84	92	93	-
llama3.1-8b	128,000	73	72.6	84.9	-
mixtral-8x7b	32,000	70.6	40.2	60.4	-
Snowflake Arctic	4,096	67.3	64.3	69.7	79
mistral-7b	32,000	62.5	26.2	52.1	-
GPT 3.5 Turbo^*	4,097	70	48.1	57.1	-

Previous model versions¶

The Snowflake Cortex AI_COMPLETE and COMPLETE functions also supports the following older model versions. We recommend using the latest model versions instead of the versions listed in this table.

Model	Context Window (Tokens)	MMLU (Reasoning)	HumanEval (Coding)	GSM8K (Arithmetic Reasoning)	Spider 1.0 (SQL)
mistral-large	32,000	81.2	45.1	81	81
llama-2-70b-chat	4,096	68.9	30.5	57.5	-

Using Snowflake Cortex AI Functions with Python¶

Call Cortex AI Functions in Snowpark Python¶

You can use Snowflake Cortex AI Functions in the Snowpark Python API. These functions include the following. Note that the functions in Snowpark Python have names in Pythonic “snake_case” format, with words separated by underscores and all letters in lowercase.

`ai_agg` example¶

The ai_agg function aggregates a column of text using natural language instructions in a similar manner to how you would ask an analyst to summarize or extract findings from grouped or ungrouped data.

The following example summarizes customer reviews for each product using the ai_agg function. The function takes a column of text and a natural language instruction to summarize the reviews.

from snowflake.snowpark.functions import ai_agg, col

df = session.create_dataframe([
    [1, "Excellent product!"],
    [1, "Great battery life."],
    [1, "A bit expensive but worth it."],
    [2, "Terrible customer service."],
    [2, "Won’t buy again."],
], schema=["product_id", "review"])

# Summarize reviews per product
summary_df = df.group_by("product_id").agg(
    ai_agg(col("review"), "Summarize the customer reviews in one sentence.")
)
summary_df.show()

Copy

Note

Use task descriptions that are detailed and centered around the use case. For example, “Summarize the customer feedback for an investor report”.

Classify text with `ai_classify`¶

The ai_classify function takes a string or image and classifies it into the categories that you define.

The following example classifies travel reviews into categories such as “travel” and “cooking”. The function takes a column of text and a list of categories to classify the text into.

from snowflake.snowpark.functions import ai_classify, col

df = session.create_dataframe([
    ["I dream of backpacking across South America."],
    ["I made the best pasta yesterday."],
], schema=["sentence"])

df = df.select(
    "sentence",
    ai_classify(col("sentence"), ["travel", "cooking"]).alias("classification")
)
df.show()

Copy

Note

You can provide up to 500 categories. You can classify both text and images.

Filter rows with `ai_filter`¶

The ai_filter function evaluates a natural language condition and returns True or False. You can use it to filter or tag rows.

from snowflake.snowpark.functions import ai_filter, prompt, col

df = session.create_dataframe(["Canada", "Germany", "Japan"], schema=["country"])

filtered_df = df.select(
    "country",
    ai_filter(prompt("Is {0} in Asia?", col("country"))).alias("is_in_asia")
)
filtered_df.show()

Copy

Note

You can filter on both strings and files. For dynamic prompts, use the :code:prompt function. For more information, see Snowpark Python reference.

Call Cortex AI Functions in Snowflake ML¶

Snowflake ML contains the older AI Functions, those with names that don’t begin with “AI”. These functions are supported in version 1.1.2 and later of Snowflake ML. The names are rendered in Pythonic “snake_case” format, with words separated by underscores and all letters in lowercase.

If you run your Python script outside of Snowflake, you must create a Snowpark session to use these functions. See Connecting to Snowflake for instructions.

Process single values¶

The following Python example illustrates calling Snowflake Cortex AI functions on single values:

from snowflake.cortex import complete, extract_answer, sentiment, summarize, translate

text = """
    The Snowflake company was co-founded by Thierry Cruanes, Marcin Zukowski,
    and Benoit Dageville in 2012 and is headquartered in Bozeman, Montana.
"""

print(complete("llama3.1-8b", "how do snowflakes get their unique patterns?"))
print(extract_answer(text, "When was snowflake founded?"))
print(sentiment("I really enjoyed this restaurant. Fantastic service!"))
print(summarize(text))
print(translate(text, "en", "fr"))

Copy

Pass hyperparameter options¶

You can pass options that affect the model’s hyperparameters when using the complete function. The following Python example illustrates modifying the maximum number of output tokens that the model can generate:

from snowflake.cortex import complete, CompleteOptions

model_options1 = CompleteOptions(
    {'max_tokens':30}
)

print(complete("llama3.1-8b", "how do snowflakes get their unique patterns?", options=model_options1))

Copy

Call functions on table columns¶

You can call an AI function on a table column, as shown below. This example requires a session object (stored in session) and a table articles containing a text column abstract_text, and creates a new column abstract_summary containing a summary of the abstract.

from snowflake.cortex import summarize
from snowflake.snowpark.functions import col

article_df = session.table("articles")
article_df = article_df.withColumn(
    "abstract_summary",
    summarize(col("abstract_text"))
)
article_df.collect()

Copy

Note

The advanced chat-style (multi-message) form of COMPLETE is not currently supported in Snowflake ML Python.

Using Snowflake Cortex AI functions with Snowflake CLI¶

Snowflake Cortex AI Functions are available in Snowflake CLI version 2.4.0 and later. See Introducing Snowflake CLI for more information about using Snowflake CLI. The functions are the old-style functions, those with names that don’t begin with “AI”.

The following examples illustrate using the snow cortex commands on single values. The -c parameter specifies which connection to use.

Note

The advanced chat-style (multi-message) form of COMPLETE is not currently supported in Snowflake CLI.

snow cortex complete "Is 5 more than 4? Please answer using one word without a period." -c "snowhouse"

Copy

snow cortex extract-answer "what is snowflake?" "snowflake is a company" -c "snowhouse"

Copy

snow cortex sentiment "Mary had a little Lamb" -c "snowhouse"

Copy

snow cortex summarize "John has a car. John's car is blue. John's car is old and John is thinking about buying a new car. There are a lot of cars to choose from and John cannot sleep because it's an important decision for John."

Copy

snow cortex translate herb --to pl

Copy

You can also use files that contain the text you want to use for the commands. For this example, assume that the file about_cortex.txt contains the following content:

Snowflake Cortex gives you instant access to industry-leading large language models (LLMs) trained by researchers at companies like Anthropic, Mistral, Reka, Meta, and Google, including Snowflake Arctic, an open enterprise-grade model developed by Snowflake.

Since these LLMs are fully hosted and managed by Snowflake, using them requires no setup. Your data stays within Snowflake, giving you the performance, scalability, and governance you expect.

Snowflake Cortex features are provided as SQL functions and are also available in Python. The available functions are summarized below.

COMPLETE: Given a prompt, returns a response that completes the prompt. This function accepts either a single prompt or a conversation with multiple prompts and responses.
EMBED_TEXT_768: Given a piece of text, returns a vector embedding that represents that text.
EXTRACT_ANSWER: Given a question and unstructured data, returns the answer to the question if it can be found in the data.
SENTIMENT: Returns a sentiment score, from -1 to 1, representing the detected positive or negative sentiment of the given text.
SUMMARIZE: Returns a summary of the given text.
TRANSLATE: Translates given text from any supported language to any other.

You can then execute the snow cortex summarize command by passing in the filename using the --file parameter, as shown:

snow cortex summarize --file about_cortex.txt

Copy

Snowflake Cortex offers instant access to industry-leading language models, including Snowflake Arctic, with SQL functions for completing prompts (COMPLETE), text embedding (EMBED\_TEXT\_768), extracting answers (EXTRACT\_ANSWER), sentiment analysis (SENTIMENT), summarizing text (SUMMARIZE), and translating text (TRANSLATE).

For more information about these commands, see snow cortex commands.

Legal notices¶

The data classification of inputs and outputs are as set forth in the following table.

Input data classification	Output data classification	Designation
Usage Data	Customer Data	Generally available functions are Covered AI Features. Preview functions are Preview AI Features. [1]

For additional information, refer to Snowflake AI and ML.

Snowflake Cortex AI Functions (including LLM functions)¶

Available functions¶

Cortex AI functions¶

Helper functions¶

Cortex Guard¶

Performance considerations¶

Cortex LLM privileges¶

CORTEX_USER database role¶

CORTEX_EMBED_USER database role¶

Using AI Functions in stored procedures with EXECUTE AS RESTRICTED CALLER¶

Control model access¶

Account-level allowlist parameter¶

Role-based access control (RBAC)¶

Refresh model objects and application roles¶

Grant application roles to user roles¶

Use model objects with supported features¶

Using RBAC with account-level allowlist¶

Common pitfalls¶

Supported features¶

Regional availability¶

Create stage for media files¶

Cortex AI Functions storage best practices¶

Cost considerations¶

Warehouse sizing¶

Track costs for AI services¶

Track credit consumption for Cortex AI Functions¶

Usage quotas¶

Model restrictions¶

Choosing a model¶

Large models¶

Medium models¶

Small models¶

Previous model versions¶

Using Snowflake Cortex AI Functions with Python¶

Call Cortex AI Functions in Snowpark Python¶

ai_agg example¶

Classify text with ai_classify¶

Filter rows with ai_filter¶

Call Cortex AI Functions in Snowflake ML¶

Process single values¶

Pass hyperparameter options¶

Call functions on table columns¶

Using Snowflake Cortex AI functions with Snowflake CLI¶

Legal notices¶

`ai_agg` example¶

Classify text with `ai_classify`¶

Filter rows with `ai_filter`¶