Categories: String & binary functions (Large Language Model)

COMPLETE (SNOWFLAKE.CORTEX)

Note

AI_COMPLETE is the latest version of this function. Use AI_COMPLETE for the latest functionality. You can continue to use COMPLETE (SNOWFLAKE.CORTEX).

Given a prompt, generates a response (completion) using your choice of supported language model.

Note

A variant of this function allows COMPLETE to produce responses to images, including:

Comparing images
Captioning images
Classifying images
Extracting entities from images
Answering questions using data in graphs and charts

See COMPLETE (SNOWFLAKE.CORTEX) (multimodal) for more information.

Syntax

SNOWFLAKE.CORTEX.COMPLETE(
    <model>, <prompt_or_history> [ , <options> ] )

Arguments

Required:

model

A string specifying the model to be used. Specify one of the following values.

claude-4-opus
claude-4-sonnet
claude-3-7-sonnet
claude-3-5-sonnet
deepseek-r1
gemma-7b
jamba-1.5-mini
jamba-1.5-large
jamba-instruct
llama2-70b-chat
llama3-8b
llama3-70b
llama3.1-8b
llama3.1-70b
llama3.1-405b
llama3.2-1b
llama3.2-3b
llama3.3-70b
llama4-maverick
llama4-scout
mistral-large
mistral-large2
mistral-7b
mixtral-8x7b
openai-gpt-4.1
openai-o4-mini
reka-core
reka-flash
snowflake-arctic
snowflake-llama-3.1-405b
snowflake-llama-3.3-70b

Supported models might have different costs.

prompt_or_history

The prompt or conversation history to be used to generate a completion.

If options is not present, the prompt given must be a string.

If options is present, the argument must be an array of objects representing a conversation in chronological order. Each object must contain a role key and a content key. The content value is a prompt or a response, depending on the role. The role must be one of the following.

`role` value	`content` value
`'system'`	An initial plain-English prompt to the language model to provide it with background information and instructions for a response style. For example, “Respond in the style of a pirate.” The model does not generate a response to a system prompt. Only one system prompt may be provided, and if it is present, it must be the first in the array.
`'user'`	A prompt provided by the user. Must follow the system prompt (if there is one) or an assistant response.
`'assistant'`	A response previously provided by the language model. Must follow a user prompt. Past responses can be used to provide a stateful conversational experience; see Usage Notes.

Optional:

options

An object containing zero or more of the following options that affect the model’s hyperparameters. See LLM Settings.

temperature: A value from 0 to 1 (inclusive) that controls the randomness of the output of the language model. A higher temperature (for example, 0.7) results in more diverse and random output, while a lower temperature (such as 0.2) makes the output more deterministic and focused.
Default: 0

top_p: A value from 0 to 1 (inclusive) that controls the randomness and diversity of the language model, generally used as an alternative to temperature. The difference is that top_p restricts the set of possible tokens that the model outputs, while temperature influences which tokens are chosen at each step.
Default: 0

max_tokens: Sets the maximum number of output tokens in the response. Small values can result in truncated responses.
Default: 4096
Maximum allowed value: 8192

guardrails: Filters potentially unsafe and harmful responses from a language model using Cortex Guard. Either TRUE or FALSE.
Default: FALSE

response_format: A JSON schema that the response should follow. This is a SQL sub-object, not a string. If response_format is not specified, the response is a string containing either the response or a serialized JSON object containing the response and information about it.
For more information, see AI_COMPLETE Structured Outputs.

Specifying the options argument, even if it is an empty object ({}), affects how the prompt argument is interpreted and how the response is formatted.
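
The effect of the options argument on both input and output can be seen by issuing the same prompt in each form. The following is a minimal sketch; the model name and prompt text are arbitrary choices for illustration.

```sql
-- Form 1: no options argument.
-- The prompt is a plain string and the result is the response text.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large2',
    'What is a data warehouse?'
) AS response_text;

-- Form 2: options argument present (here an empty object).
-- The prompt must be an array of role/content objects, and the result
-- is a string representation of a JSON object.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large2',
    [
        {'role': 'user', 'content': 'What is a data warehouse?'}
    ],
    {}
) AS response_json;
```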

Returns

When the options argument is not specified, returns a string containing the response.

When the options argument is given, and this object contains the response_format key, returns a string representation of a JSON object adhering to the specified JSON schema.
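
As a rough illustration of the response_format case, the sketch below asks for an object with a single sentiment field. The wrapper shape ('type': 'json' with a nested 'schema') is an assumption based on the Structured Outputs page referenced above and should be confirmed there; the model, prompt, and field name are hypothetical.

```sql
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large2',
    [
        {'role': 'user', 'content': 'Return the sentiment of this review: this was really good'}
    ],
    {
        'temperature': 0,
        -- Assumed wrapper shape; see AI_COMPLETE Structured Outputs.
        'response_format': {
            'type': 'json',
            'schema': {
                'type': 'object',
                'properties': {
                    'sentiment': {'type': 'string'}  -- hypothetical field name
                },
                'required': ['sentiment']
            }
        }
    }
) AS structured_response;
```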

When the options argument is given, and this object does not contain the response_format key, returns a string representation of a JSON object containing the following keys.

"choices": An array of the model’s responses. (Currently, only one response is provided.) Each response is an object containing a "messages" key whose value is the model’s response to the latest prompt.
"created": UNIX timestamp (seconds since midnight, January 1, 1970) when the response was generated.
"model": The name of the model that created the response.
"usage": An object recording the number of tokens consumed and generated by this completion. Includes the following sub-keys:
- "completion_tokens": The number of tokens in the generated response.
- "prompt_tokens": The number of tokens in the prompt.
- "total_tokens": The total number of tokens consumed. This is the sum of the other two values, plus any guardrail tokens reported when guardrails is enabled.
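
Because the result is a string, it usually needs to be parsed before individual keys can be read. The following sketch shows one way to do this with TRY_PARSE_JSON and Snowflake's path notation; the model, prompt, and column aliases are illustrative assumptions.

```sql
WITH completion AS (
    -- COMPLETE returns a string; parse it into a VARIANT for path access.
    SELECT TRY_PARSE_JSON(
        SNOWFLAKE.CORTEX.COMPLETE(
            'mistral-large2',
            [{'role': 'user', 'content': 'What are large language models?'}],
            {'max_tokens': 100}
        )
    ) AS response
)
SELECT
    response:choices[0]:messages::STRING       AS answer,
    response:model::STRING                     AS model_name,
    response:usage:prompt_tokens::NUMBER       AS prompt_tokens,
    response:usage:completion_tokens::NUMBER   AS completion_tokens,
    response:usage:total_tokens::NUMBER        AS total_tokens
FROM completion;
```

TRY_PARSE_JSON returns NULL instead of raising an error if the string is not valid JSON, which keeps a batch query from failing on a single malformed row.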

Access control requirements

Users must use a role that has been granted the SNOWFLAKE.CORTEX_USER database role. See Required privileges for more information on this privilege.
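
A typical way to set this up is sketched below. The role name cortex_user_role and the user name some_user are hypothetical, and the sketch assumes the statements are run by a role that is allowed to grant SNOWFLAKE.CORTEX_USER (ACCOUNTADMIN here).

```sql
USE ROLE ACCOUNTADMIN;

-- Hypothetical account role that will carry the Cortex privilege.
CREATE ROLE IF NOT EXISTS cortex_user_role;

-- Grant the database role named in this section to the account role.
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE cortex_user_role;

-- Give the account role to a (hypothetical) user.
GRANT ROLE cortex_user_role TO USER some_user;
```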

Usage notes

COMPLETE does not retain any state from one call to the next. To use the COMPLETE function to provide a stateful, conversational experience, pass all previous user prompts and model responses in the conversation as part of the prompt_or_history array (see Templates for Chat Models). Keep in mind that the number of tokens processed increases for each “round,” and costs increase proportionally.
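
A second-round call in such a conversation might look like the following sketch. The conversation content is invented for illustration; the assistant entry stands in for the text returned by an earlier call.

```sql
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'llama3.1-70b',
    [
        -- Optional system prompt; must be first if present.
        {'role': 'system', 'content': 'You are a concise assistant.'},
        -- Round 1: the original user prompt and the model's earlier reply.
        {'role': 'user', 'content': 'Name one planet in the solar system.'},
        {'role': 'assistant', 'content': 'Mars.'},
        -- Round 2: the new prompt, which relies on the earlier context.
        {'role': 'user', 'content': 'How many moons does it have?'}
    ],
    {}
) AS response;
```

Each further round appends another user and assistant pair, so the prompt token count grows with the length of the conversation.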

Examples

Single response

To generate a single response:

SELECT SNOWFLAKE.CORTEX.COMPLETE('snowflake-arctic', 'What are large language models?');

Responses from table column

The following example generates a response for each row of a table. The reviews table contains a column named content that holds the text of reviews submitted by users. The query returns a critique of each review, for up to 10 rows.

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'openai-gpt-4.1',
    CONCAT('Critique this review in bullet points: <review>', content, '</review>')
)
FROM reviews
LIMIT 10;

Tip

As shown in this example, you can use tagging in the prompt to control the kind of response generated. See A guide to prompting LLaMA 2 for tips.

Controlling temperature and tokens

This example illustrates the use of the function’s options argument to control the inference hyperparameters in a single response. Note that in this form of the function, the prompt must be provided as an array, since this form supports multiple prompts and responses.

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'claude-4-sonnet',
    [
        {
            'role': 'user',
            'content': 'how does a snowflake get its unique pattern?'
        }
    ],
    {
        'temperature': 0.7,
        'max_tokens': 10
    }
);

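The response is a JSON object containing the message from the language model and other information. The message is truncated because max_tokens is set to 10 in the options argument. This sample output was captured with a different model than the one in the query, so its "model" value differs; in an actual response, "model" names the model that created the response.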

{
    "choices": [
        {
            "messages": " The unique pattern on a snowflake is"
        }
    ],
    "created": 1708536426,
    "model": "llama2-70b-chat",
    "usage": {
        "completion_tokens": 10,
        "prompt_tokens": 22,
        "guardrail_tokens": 0,
        "total_tokens": 32
    }
}

Controlling safety

This example illustrates the use of the Cortex Guard guardrails argument to filter unsafe and harmful responses from a language model.

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large2',
    [
        {
            'role': 'user',
            'content': '<Prompt that generates an unsafe response>'
        }
    ],
    {
        'guardrails': true
    }
);

The response is a JSON object. The following sample output was captured with a different model than the one in the query, so its "model" value differs:

{
    "choices": [
        {
            "messages": "Response filtered by Cortex Guard"
        }
    ],
    "created": 1718882934,
    "model": "mistral-7b",
    "usage": {
        "completion_tokens": 402,
        "prompt_tokens": 93,
        "guardrails_tokens": 677,
        "total_tokens": 1172
    }
}

Providing a system prompt

This example illustrates the use of a system prompt to provide a sentiment analysis of movie reviews. The prompt argument here is an array of objects, each having an appropriate role value.

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'llama3.1-70b',
    [
        {'role': 'system', 'content': 'You are a helpful AI assistant. Analyze the movie review text and determine the overall sentiment. Answer with just "Positive", "Negative", or "Neutral"'},
        {'role': 'user', 'content': 'this was really good'}
    ],
    {}
) AS response;

The response is a JSON object containing the response from the language model and other information. This sample output was captured with a different model than the one in the query, so its "model" value differs.

{
    "choices": [
        {
            "messages": " Positive"
        }
    ],
    "created": 1708479449,
    "model": "llama2-70b-chat",
    "usage": {
        "completion_tokens": 3,
        "prompt_tokens": 64,
        "total_tokens": 67
    }
}

Legal notices

The following notice applies to Cortex COMPLETE Structured Output functionality only:

Use of models provided on the Snowflake Model and Service Flow-Down Terms page is subject to the terms specified therein. The data classification of inputs and outputs is as set forth in the following table.

Input data classification	Output data classification	Designation
Usage Data	Customer Data	Covered AI Feature

For the rest of COMPLETE functionality, refer to Snowflake AI and ML for legal notices.

Limitations

Snowflake Cortex functions do not support dynamic tables.