Categories: String & binary functions (AI Functions)
AI_COMPLETE (Single string)
Generates a response (completion) for a text prompt using a supported language model.
Syntax
The function contains two required arguments and four optional arguments. The function can be used with either positional or named argument syntax.
Using AI_COMPLETE with a single string input
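As a rough sketch of the two call styles described above, the statements below make one call with positional arguments and one with named arguments plus an optional argument. The model name 'mistral-large2' and the prompt text are illustrative assumptions; substitute a model that is available in your account and region.

```sql
-- Positional syntax: the two required arguments, model first, then prompt.
-- 'mistral-large2' is an assumed model name used only for illustration.
SELECT AI_COMPLETE('mistral-large2', 'Summarize what a data warehouse is in one sentence.');

-- Named syntax: the same required arguments by name, plus one optional argument.
SELECT AI_COMPLETE(
    model => 'mistral-large2',
    prompt => 'Summarize what a data warehouse is in one sentence.',
    model_parameters => {'temperature': 0}
);
```

Named arguments tend to be easier to read once optional arguments are involved, because each optional argument can be given without relying on its position.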
Arguments
model: A string specifying the model to be used. Supported models might have different costs.
prompt: A string prompt.
model_parameters: An object containing zero or more of the following options that affect the model’s hyperparameters. See LLM Settings.
- temperature: A value from 0 to 1 (inclusive) that controls the randomness of the output of the language model. A higher temperature (for example, 0.7) results in more diverse and random output, while a lower temperature (such as 0.2) makes the output more deterministic and focused. Default: 0
- top_p: A value from 0 to 1 (inclusive) that controls the randomness and diversity of the language model, generally used as an alternative to temperature. The difference is that top_p restricts the set of possible tokens that the model outputs, while temperature influences which tokens are chosen at each step. Default: 0
- max_tokens: Sets the maximum number of output tokens in the response. Small values can result in truncated responses. Default: 4096. Maximum allowed value: 8192.
- guardrails: Filters potentially unsafe and harmful responses from a language model using Cortex Guard. Either TRUE or FALSE. Default: FALSE
response_format: The format that the response should follow. You can specify the response format as:
- A JSON schema that the response should follow. This is a SQL sub-object, not a string.
- A SQL type literal beginning with the TYPE keyword. The defined type must use an OBJECT as its top-level container, and fields of this OBJECT are mapped to corresponding JSON fields and values.
If response_format is not specified, the response is a string containing either the response or a serialized JSON object containing the response and information about it. For more information, see AI_COMPLETE structured outputs.
show_details: A boolean flag that indicates whether to return a serialized JSON object containing the response and information about it.
Returns
When the show_details argument is not specified or set to FALSE and the response_format is not specified or set to NULL, returns a string containing the response.
When the show_details argument is not specified or set to FALSE and the response_format is specified, returns an object following the provided response format.
When the show_details argument is set to TRUE and the response_format is not specified, returns a JSON object containing the following keys:
- "choices": An array of the model’s responses. (Currently, only one response is provided.) Each response is an object containing a "messages" key whose value is the model’s response to the latest prompt.
- "created": UNIX timestamp (seconds since midnight, January 1, 1970) when the response was generated.
- "model": The name of the model that created the response.
- "usage": An object recording the number of tokens consumed and generated by this completion. Includes the following sub-keys:
  - "completion_tokens": The number of tokens in the generated response.
  - "prompt_tokens": The number of tokens in the prompt.
  - "total_tokens": The total number of tokens consumed, which is the sum of the other two values.
When the show_details argument is set to TRUE and the response_format is specified, returns a JSON object containing the following keys:
- "structured_output": A JSON object following the specified response format.
- "created": UNIX timestamp (seconds since midnight, January 1, 1970) when the response was generated.
- "model": The name of the model that created the response.
- "usage": An object recording the number of tokens consumed and generated by this completion. Includes the following sub-keys:
  - "completion_tokens": The number of tokens in the generated response.
  - "prompt_tokens": The number of tokens in the prompt.
  - "total_tokens": The total number of tokens consumed, which is the sum of the other two values.
Cortex Guard
Cortex Guard is an option of the AI_COMPLETE (or SNOWFLAKE.CORTEX.COMPLETE) function designed to filter potentially unsafe and harmful responses from a language model. Cortex Guard is currently built with Meta’s Llama Guard 3. It evaluates each response of the language model before that output is returned to the application. Once you activate Cortex Guard, language model responses that may be associated with violent crimes, hate, sexual content, self-harm, and more are automatically filtered.
To enable Cortex Guard, set the guardrails option in the model_parameters argument to TRUE. For an example, see Filtering harmful responses with Cortex Guard.
Note
Usage of Cortex Guard incurs compute charges based on the number of input tokens processed, in addition to the charges for the AI_COMPLETE function.
Examples
Single response
To generate a single response:
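A minimal sketch of such a call, using positional arguments. The model name 'mistral-large2' is an assumption for illustration; any supported model can be used in its place.

```sql
-- One prompt in, one completion out.
-- 'mistral-large2' is an assumed model name, not a recommendation.
SELECT AI_COMPLETE('mistral-large2', 'What are large language models?');
```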
Responses from table column
The following example generates a response for each row in the reviews table, using the content column as input. Each query result contains a critique of the corresponding review.
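One way such a query might look is sketched below. The reviews table and content column come from the description above; the model name, the wording of the instruction, the <review> tag, and the LIMIT clause are assumptions added for illustration.

```sql
-- Build one prompt per row by wrapping the review text in <review> tags,
-- then ask the model to critique it.
-- 'mistral-large2' and LIMIT 10 are illustrative assumptions.
SELECT AI_COMPLETE(
    'mistral-large2',
    CONCAT('Critique this review in bullet points: <review>', content, '</review>')
) AS critique
FROM reviews
LIMIT 10;
```

The LIMIT clause keeps a trial run small, since each row results in a separate model call.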
Tip
As shown in this example, you can use tagging in the prompt to control the kind of response generated. See A guide to prompting LLaMA 2 for tips.
Controlling model parameters
The following example specifies the model_parameters used to provide a response.
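A minimal sketch, assuming the model name 'mistral-large2' and arbitrarily chosen parameter values. The deliberately small max_tokens value is what would cut the response short.

```sql
-- temperature raises the randomness of the output;
-- max_tokens caps the response at 10 output tokens, so it will likely be truncated.
-- The model name and both values are illustrative assumptions.
SELECT AI_COMPLETE(
    model => 'mistral-large2',
    prompt => 'How does a snowflake get its unique pattern?',
    model_parameters => {
        'temperature': 0.7,
        'max_tokens': 10
    }
);
```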
The response is a string containing the message from the language model and other information. Note that the response is truncated as instructed in the model_parameters argument.
Detailed output
The following example shows how you can use the show_details argument to return additional inference details.
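A minimal sketch of such a call. The model name, prompt, and parameter values are assumptions chosen for illustration.

```sql
-- show_details => TRUE requests the response together with
-- metadata such as the model name and token usage.
-- 'mistral-large2' and the parameter values are illustrative assumptions.
SELECT AI_COMPLETE(
    model => 'mistral-large2',
    prompt => 'How does a snowflake get its unique pattern?',
    model_parameters => {
        'temperature': 0.7,
        'max_tokens': 10
    },
    show_details => TRUE
);
```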
The response is a JSON object with the model’s message and related details. The model_parameters argument was used to truncate the output.
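If individual fields of the detailed output are needed, something along the following lines may work. This is a sketch under assumptions: the key names follow the Returns section, the model name is illustrative, and the result is cast to VARCHAR and re-parsed so that the query does not depend on whether the function returns a serialized string or an object.

```sql
-- Capture the detailed output once, then pull out individual fields.
-- 'mistral-large2' is an assumed model name.
WITH completion AS (
    SELECT AI_COMPLETE(
        model => 'mistral-large2',
        prompt => 'How does a snowflake get its unique pattern?',
        show_details => TRUE
    ) AS details
),
parsed AS (
    -- Cast to VARCHAR and parse again so the path expressions below
    -- work whether details is a JSON string or an object.
    SELECT PARSE_JSON(details::VARCHAR) AS d
    FROM completion
)
SELECT
    d:choices[0]:messages::STRING    AS message,
    d:model::STRING                  AS model_name,
    d:usage:prompt_tokens::INTEGER   AS prompt_tokens,
    d:usage:total_tokens::INTEGER    AS total_tokens
FROM parsed;
```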
Filtering harmful responses with Cortex Guard
The following example enables Cortex Guard by setting the guardrails option in the model_parameters argument to TRUE. Responses that Cortex Guard classifies as unsafe are filtered before being returned.
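A minimal sketch of a guarded call. The model name and the prompt are assumptions for illustration; the only part that matters here is the guardrails option.

```sql
-- guardrails => TRUE asks Cortex Guard to evaluate the model's response
-- before it is returned to the caller.
-- 'mistral-large2' and the prompt are illustrative assumptions.
SELECT AI_COMPLETE(
    model => 'mistral-large2',
    prompt => 'Tell me how to get into my coworker''s email account without permission.',
    model_parameters => {
        'guardrails': TRUE
    }
);
```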
Specifying a JSON response format
This example illustrates the use of the function’s response_format argument to return a structured response by providing a type literal.
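A sketch of what a type-literal response format might look like, assuming the TYPE keyword is followed by a structured OBJECT type as described in the Arguments section. The model name, prompt, and field names (food_quality, wait_time, overall_sentiment) are hypothetical.

```sql
-- The top-level container is an OBJECT; each field is expected to map
-- to a JSON field of the same name in the response.
-- Model name, prompt, and field names are illustrative assumptions.
SELECT AI_COMPLETE(
    model => 'mistral-large2',
    prompt => 'Return the customer sentiment for the following review: The pizza was decent, but the service was slow and the waiters were rude.',
    response_format => TYPE OBJECT(
        food_quality STRING,
        wait_time STRING,
        overall_sentiment STRING
    )
);
```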
The response is a JSON object following the structured response format.
Response:
Specifying a JSON response format with details, using a type literal
This example illustrates the use of the response_format argument to return a structured response, combined with show_details to get additional inference information, using a type literal.
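A sketch of this combination, under the same assumptions about type-literal syntax as in the Arguments section. The model name, prompt, and field names are hypothetical.

```sql
-- Combine a type-literal response format with show_details => TRUE.
-- Per the Returns section, the structured result is then expected
-- under the "structured_output" key, next to "model" and "usage".
-- Model name, prompt, and field names are illustrative assumptions.
SELECT AI_COMPLETE(
    model => 'mistral-large2',
    prompt => 'Extract the product and the rating from this note: I bought the trail shoes and would give them 4 out of 5.',
    response_format => TYPE OBJECT(
        product STRING,
        rating NUMBER
    ),
    show_details => TRUE
);
```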
The response is a JSON object containing the structured response along with additional inference metadata.
Specifying a JSON response format with details, using a JSON schema
This example illustrates the use of the function’s response_format argument to return a structured response combined with show_details to get additional inference information, using a JSON schema.
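A sketch of the JSON schema variant. The schema is passed as a SQL object, not as a string; the model name, prompt, and property names are hypothetical, and the 'type': 'json' wrapper around the schema is an assumption about the expected shape of the object.

```sql
-- The response format is a SQL object literal holding a JSON schema.
-- Model name, prompt, property names, and the wrapper keys
-- ('type' and 'schema') are illustrative assumptions.
SELECT AI_COMPLETE(
    model => 'mistral-large2',
    prompt => 'Extract the product and the rating from this note: I bought the trail shoes and would give them 4 out of 5.',
    response_format => {
        'type': 'json',
        'schema': {
            'type': 'object',
            'properties': {
                'product': {'type': 'string'},
                'rating': {'type': 'number'}
            },
            'required': ['product', 'rating']
        }
    },
    show_details => TRUE
);
```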
The response is a JSON object containing the structured response along with additional inference metadata.
Note
AI_COMPLETE is the updated version of COMPLETE. For the latest functionality, use AI_COMPLETE.
Legal notices
Refer to Snowflake AI and ML for legal notices.