- Categories:
String & binary functions (Large Language Model)
AI_COMPLETE (Single string)¶
Note
AI_COMPLETE is the updated version of COMPLETE (SNOWFLAKE.CORTEX). For the latest functionality, use AI_COMPLETE.
Generates a response (completion) for a text prompt using a supported language model.
Syntax¶
The function takes two required arguments and three optional arguments. The function can be used with either positional or named argument syntax.
Using AI_COMPLETE with a single string input
AI_COMPLETE(
<model>, <prompt> [ , <model_parameters>, <response_format>, <show_details> ] )
Arguments¶
model
A string specifying the model to be used. Specify one of the following models:
claude-4-opus
claude-4-sonnet
claude-3-7-sonnet
claude-3-5-sonnet
deepseek-r1
gemma-7b
jamba-1.5-mini
jamba-1.5-large
jamba-instruct
llama2-70b-chat
llama3-8b
llama3-70b
llama3.1-8b
llama3.1-70b
llama3.1-405b
llama3.2-1b
llama3.2-3b
llama3.3-70b
llama4-maverick
llama4-scout
mistral-large
mistral-large2
mistral-7b
mixtral-8x7b
openai-gpt-4.1
openai-o4-mini
reka-core
reka-flash
snowflake-arctic
snowflake-llama-3.1-405b
snowflake-llama-3.3-70b
Supported models might have different costs.
prompt
A string containing the prompt.
model_parameters
An object containing zero or more of the following options that affect the model’s hyperparameters. See LLM Settings.
temperature
: A value from 0 to 1 (inclusive) that controls the randomness of the output of the language model. A higher temperature (for example, 0.7) results in more diverse and random output, while a lower temperature (such as 0.2) makes the output more deterministic and focused. Default: 0
top_p
: A value from 0 to 1 (inclusive) that controls the randomness and diversity of the language model, generally used as an alternative to temperature. The difference is that top_p restricts the set of possible tokens that the model outputs, while temperature influences which tokens are chosen at each step. Default: 0
max_tokens
: Sets the maximum number of output tokens in the response. Small values can result in truncated responses. Default: 4096. Maximum allowed value: 8192
guardrails
: Filters potentially unsafe and harmful responses from a language model using Cortex Guard. Either TRUE or FALSE. Default: FALSE
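The bounds above can be checked client-side before the object is passed to the function. The following is a minimal sketch (a hypothetical helper, not part of any Snowflake API) that validates a model_parameters object against the documented defaults and limits.

```python
# Hypothetical client-side validator for the documented model_parameters
# bounds. Snowflake enforces these server-side; this only catches mistakes
# before a query is sent.
def validate_model_parameters(params: dict) -> dict:
    """Raise ValueError if any documented bound is violated."""
    temperature = params.get("temperature", 0)   # default 0, range 0..1
    top_p = params.get("top_p", 0)               # default 0, range 0..1
    max_tokens = params.get("max_tokens", 4096)  # default 4096, max 8192
    guardrails = params.get("guardrails", False) # default FALSE

    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1 inclusive")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1 inclusive")
    if max_tokens > 8192:
        raise ValueError("max_tokens cannot exceed 8192")
    if not isinstance(guardrails, bool):
        raise ValueError("guardrails must be TRUE or FALSE")
    return params

validate_model_parameters({"temperature": 0.7, "max_tokens": 10})
```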
response_format
A JSON schema that the response should follow. This is a SQL sub-object, not a string. If response_format is not specified, the response is a string containing either the response or a serialized JSON object containing the response and information about it. For more information, see AI_COMPLETE Structured Outputs.
show_details
A boolean flag that indicates whether to return a serialized JSON object containing the response and information about it.
Returns¶
When the show_details argument is not specified or set to FALSE and response_format is not specified or set to NULL, returns a string containing the response.
When the show_details argument is not specified or set to FALSE and response_format is specified, returns an object following the provided response format.
When the show_details argument is set to TRUE and response_format is not specified, returns a JSON object containing the following keys:
"choices"
: An array of the model’s responses. (Currently, only one response is provided.) Each response is an object containing a "messages" key whose value is the model’s response to the latest prompt.
"created"
: UNIX timestamp (seconds since midnight, January 1, 1970) when the response was generated.
"model"
: The name of the model that created the response.
"usage"
: An object recording the number of tokens consumed and generated by this completion. Includes the following sub-keys:
"completion_tokens"
: The number of tokens in the generated response.
"prompt_tokens"
: The number of tokens in the prompt.
"total_tokens"
: The total number of tokens consumed, which is the sum of the other two values.
When the show_details argument is set to TRUE and response_format is specified, returns a JSON object containing the following keys:
"structured_output"
: A JSON object following the specified response format.
"created"
: UNIX timestamp (seconds since midnight, January 1, 1970) when the response was generated.
"model"
: The name of the model that created the response.
"usage"
: An object recording the number of tokens consumed and generated by this completion. Includes the following sub-keys:
"completion_tokens"
: The number of tokens in the generated response.
"prompt_tokens"
: The number of tokens in the prompt.
"total_tokens"
: The total number of tokens consumed, which is the sum of the other two values.
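The detailed response shape described above can be handled in client code once the query result is fetched. A minimal sketch follows, using an illustrative payload that mirrors the documented keys (it is not real model output).

```python
import json

# Sample payload mirroring the documented show_details => TRUE response
# shape; the values are illustrative only.
raw = """
{
  "choices": [{"messages": " The unique pattern on a snowflake is"}],
  "created": 1708536426,
  "model": "llama2-70b-chat",
  "usage": {"completion_tokens": 10, "prompt_tokens": 22, "total_tokens": 32}
}
"""

details = json.loads(raw)
message = details["choices"][0]["messages"]  # only one choice is currently returned
usage = details["usage"]
# Per the documentation, total_tokens is the sum of the other two counts.
assert usage["total_tokens"] == usage["completion_tokens"] + usage["prompt_tokens"]
print(message)
```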
Examples¶
Single response¶
To generate a single response:
SELECT AI_COMPLETE('snowflake-arctic', 'What are large language models?');
Responses from table column¶
The following example generates a response for each row in the reviews table, using the content column as input. Each query result contains a critique of the corresponding review.
SELECT AI_COMPLETE(
'mistral-large',
CONCAT('Critique this review in bullet points: <review>', content, '</review>')
) FROM reviews LIMIT 10;
Tip
As shown in this example, you can use tagging in the prompt to control the kind of response generated. See A guide to prompting LLaMA 2 for tips.
Controlling model parameters¶
The following example specifies the model_parameters used to generate a response.
SELECT AI_COMPLETE(
model => 'llama2-70b-chat',
prompt => 'how does a snowflake get its unique pattern?',
model_parameters => {
'temperature': 0.7,
'max_tokens': 10
}
);
The response is a string containing the message from the language model and other information. Note that the response is truncated as instructed in the model_parameters argument.
"The unique pattern on a snowflake is"
Detailed output¶
The following example shows how you can use the show_details argument to return additional inference details.
SELECT AI_COMPLETE(
model => 'llama2-70b-chat',
prompt => 'how does a snowflake get its unique pattern?',
model_parameters => {
'temperature': 0.7,
'max_tokens': 10
},
show_details => true
);
The response is a JSON object with the model’s message and related details. The model_parameters argument was used to truncate the output.
{
"choices": [
{
"messages": " The unique pattern on a snowflake is"
}
],
"created": 1708536426,
"model": "llama2-70b-chat",
"usage": {
"completion_tokens": 10,
"prompt_tokens": 22,
"guardrail_tokens": 0,
"total_tokens": 32
}
}
Specifying a JSON response format¶
This example illustrates the use of the function’s response_format argument to return a structured response.
SELECT AI_COMPLETE(
model => 'llama2-70b-chat',
prompt => 'Extract structured data from this customer interaction note: Customer Sarah Jones complained about the mobile app crashing during checkout. She tried to purchase 3 items: a red XL jacket ($89.99), blue running shoes ($129.50), and a fitness tracker ($199.00). The app crashed after she entered her shipping address at 123 Main St, Portland OR, 97201. She has been a premium member since January 2024.',
model_parameters => {
'temperature': 0,
'max_tokens': 4096
},
response_format => {
'type':'json',
'schema':{'type' : 'object','properties' : {'sentiment_categories':{'type':'array','items':{'type':'object','properties':
{'food_quality' : {'type' : 'string'},'food_taste': {'type':'string'}, 'wait_time': {'type':'string'}, 'food_cost': {'type':'string'}},'required':['food_quality','food_taste' ,'wait_time','food_cost']}}}}
}
);
The response is a JSON object following the structured response format.
Response:
{
"sentiment_categories": [
{
"food_cost": "negative",
"food_quality": "positive",
"food_taste": "positive",
"wait_time": "neutral"
}
]
}
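Once a structured response like the one above is fetched, client code can verify that every entry carries the keys listed in the schema’s required array. The following is a hypothetical helper sketch; Snowflake enforces the schema server-side, so this is a belt-and-suspenders check only.

```python
# Required keys, copied from the "required" list in the response_format
# schema used in the example above.
REQUIRED = ["food_quality", "food_taste", "wait_time", "food_cost"]

def check_sentiment_entries(response: dict) -> bool:
    """Return True if every sentiment entry carries all required keys."""
    entries = response["sentiment_categories"]
    return all(all(key in entry for key in REQUIRED) for entry in entries)

# Illustrative structured response matching the example output.
sample = {
    "sentiment_categories": [
        {
            "food_cost": "negative",
            "food_quality": "positive",
            "food_taste": "positive",
            "wait_time": "neutral",
        }
    ]
}

check_sentiment_entries(sample)  # True
```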
Specifying a JSON response format with detailed output¶
This example illustrates the use of the function’s response_format argument to return a structured response, combined with show_details to get additional inference information.
SELECT AI_COMPLETE(
model => 'llama2-70b-chat',
prompt => 'Extract structured data from this customer interaction note: Customer Sarah Jones complained about the mobile app crashing during checkout. She tried to purchase 3 items: a red XL jacket ($89.99), blue running shoes ($129.50), and a fitness tracker ($199.00). The app crashed after she entered her shipping address at 123 Main St, Portland OR, 97201. She has been a premium member since January 2024.',
model_parameters => {
'temperature': 0,
'max_tokens': 4096
},
response_format => {
'type':'json',
'schema':{'type' : 'object','properties' : {'sentiment_categories':{'type':'array','items':{'type':'object','properties':
{'food_quality' : {'type' : 'string'},'food_taste': {'type':'string'}, 'wait_time': {'type':'string'}, 'food_cost': {'type':'string'}},'required':['food_quality','food_taste' ,'wait_time','food_cost']}}}}
},
show_details => true
);
The response is a JSON object containing the structured response with additional inference metadata.
{
"created": 1738683744,
"model": "llama2-70b-chat",
"structured_output": [
{
"raw_message": {
"sentiment_categories": [
{
"food_cost": "negative",
"food_quality": "positive",
"food_taste": "positive",
"wait_time": "neutral"
}
]
},
"type": "json"
}
],
"usage": {
"completion_tokens": 60,
"prompt_tokens": 94,
"total_tokens": 154
}
}