Bring-your-own model for AI_COMPLETE¶
In addition to Snowflake’s Cortex models, AI_COMPLETE can run inference against a model inference service that you deploy on Snowpark Container Services (SPCS). To use this feature, set the model argument to the name of an SPCS model inference service that hosts a text generation model.
The function accepts either of the following text inputs:
- A text prompt string. For the underlying syntax and behavior, see AI_COMPLETE (Single string).
- A prompt object built with PROMPT that contains text-only template arguments. For the underlying syntax and behavior, see AI_COMPLETE (Prompt object).
FILE input and multimodal prompts are not supported. See Limitations.
Prerequisites¶
Before you can use this feature, log a text generation model from Hugging Face using a Hugging Face pipeline, then deploy it as an SPCS model inference service using the Snowsight UI or the Python API.
For more examples of creating SPCS inference services, see Example workflows.
Syntax¶
The function accepts the same arguments as AI_COMPLETE (Single string) and AI_COMPLETE (Prompt object). The only change is that the model argument is an SPCS service name instead of the name of a Cortex model.
Using AI_COMPLETE with a single string input:
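A sketch of that call shape, assuming it mirrors the AI_COMPLETE (Single string) signature with a service name in place of the model. Names in angle brackets are placeholders, not literal syntax.

```sql
-- Angle brackets mark placeholders; square brackets mark optional arguments.
-- <service_name> is the name of the SPCS model inference service.
AI_COMPLETE(
    <service_name>, <prompt>
    [ , <model_parameters>, <response_format>, <show_details> ]
)
```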
Using AI_COMPLETE with a prompt object:
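A sketch of that call shape, assuming it mirrors the AI_COMPLETE (Prompt object) signature. The template and argument names are placeholders, and every template argument must be text.

```sql
-- Angle brackets mark placeholders; square brackets mark optional arguments.
-- All PROMPT arguments must be text: FILE, OBJECT, and ARRAY arguments are rejected.
AI_COMPLETE(
    <service_name>, PROMPT('<template_string>', <text_arg> [ , ... ])
    [ , <model_parameters>, <response_format>, <show_details> ]
)
```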
Arguments¶
- model: Name of an SPCS model inference service that hosts a text generation model from Hugging Face.
- prompt: A string prompt or a prompt object built with PROMPT. FILE input and multimodal prompts are not supported. See Limitations.
- model_parameters: An object containing zero or more model hyperparameters, such as temperature, top_p, and max_tokens. For full details, see Arguments. The guardrails option is not supported. See Limitations.
- response_format: The format that the response should follow, specified as a JSON schema or a SQL TYPE literal. For full details, see Arguments.
- show_details: A boolean flag that indicates whether to return a serialized JSON object containing the response and additional inference details. For full details, see Arguments.
Returns¶
By default, returns a string containing the model’s response.
When the response_format argument is specified, returns an object that follows the provided format. When the show_details argument is set to TRUE, returns a JSON object containing the response and inference metadata. For full details on each return shape, see Returns.
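As an illustration of the non-default return shapes, the following sketch requests a structured response. It assumes the JSON-schema form of response_format used by standard AI_COMPLETE and a session variable named service_name that holds the service name. The variable name, the prompt, and the schema are all hypothetical.

```sql
-- Assumption: $service_name holds the name of the SPCS inference service.
-- Assumption: response_format accepts the same JSON-schema object as standard AI_COMPLETE.
SELECT AI_COMPLETE(
    model => $service_name,
    prompt => 'Classify the sentiment of this review: The checkout flow was fast and simple.',
    response_format => {
        'type': 'json',
        'schema': {
            'type': 'object',
            'properties': {'sentiment': {'type': 'string'}},
            'required': ['sentiment']
        }
    }
) AS structured_response;
```

Adding show_details => TRUE to the same call would return the JSON object that includes inference metadata instead.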
Error behavior¶
The function follows the same error behavior as the standard AI_COMPLETE function. For the full table of return values for the return_error_details argument, see Error behavior.
Unsupported inputs listed in Limitations cause the entire query to fail with an error, regardless of the return_error_details setting. They aren’t reported as per-row NULL values or as {value, error} envelopes.
Examples¶
The examples in this section use a SQL variable for the service name:
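A minimal sketch of such a variable, assuming a fully qualified service name. The database, schema, and service names here are hypothetical and should be replaced with your own.

```sql
-- Hypothetical service name; replace with the name of your own SPCS inference service.
SET service_name = 'MY_DB.MY_SCHEMA.MY_TEXT_GEN_SERVICE';
```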
You can confirm the service exists with the following command:
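One way to do this, assuming the hypothetical service MY_TEXT_GEN_SERVICE in schema MY_DB.MY_SCHEMA, is the SHOW SERVICES command:

```sql
-- Lists matching services; the result is empty if the service doesn't exist
-- or isn't visible to the current role.
SHOW SERVICES LIKE 'MY_TEXT_GEN_SERVICE' IN SCHEMA MY_DB.MY_SCHEMA;
```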
Single response¶
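A minimal sketch of a single call with a string prompt. It assumes the service name is stored in a session variable named service_name (a hypothetical name). The prompt text is illustrative.

```sql
-- Assumption: $service_name holds the name of the SPCS inference service.
SELECT AI_COMPLETE(
    $service_name,
    'Summarize the benefits of columnar storage in one sentence.'
) AS response;
```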
Responses from a table column¶
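A sketch that generates one response per row, assuming a hypothetical table reviews with columns review_id and review_text, and a session variable named service_name that holds the service name.

```sql
-- Hypothetical table: reviews(review_id, review_text).
-- Assumption: $service_name holds the name of the SPCS inference service.
SELECT
    review_id,
    AI_COMPLETE(
        $service_name,
        CONCAT('Summarize this review in one sentence: ', review_text)
    ) AS summary
FROM reviews
LIMIT 10;
```

The LIMIT clause keeps the number of inference calls small while you are testing.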
Controlling model parameters¶
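A sketch that passes model_parameters as the third positional argument, assuming a session variable named service_name that holds the service name. The parameter values are illustrative, not recommendations.

```sql
-- Assumption: $service_name holds the name of the SPCS inference service.
-- Don't include 'guardrails' here; that option is not supported for SPCS services.
SELECT AI_COMPLETE(
    $service_name,
    'Write a haiku about data warehouses.',
    {'temperature': 0.2, 'top_p': 0.9, 'max_tokens': 64}
) AS response;
```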
Using named arguments¶
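A sketch of the same kind of call with named arguments, assuming the argument names listed under Arguments and a session variable named service_name that holds the service name. The prompt and values are illustrative.

```sql
-- Assumption: $service_name holds the name of the SPCS inference service.
SELECT AI_COMPLETE(
    model => $service_name,
    prompt => 'List three uses of text generation models.',
    model_parameters => {'temperature': 0.7, 'max_tokens': 100},
    show_details => TRUE
) AS response;
```

Named arguments let you set show_details without also supplying a response_format.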
Using a PROMPT() template (text-only arguments)¶
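A sketch of a text-only template, assuming the numbered {0}, {1} placeholder style of PROMPT and a session variable named service_name that holds the service name. The template text and arguments are illustrative.

```sql
-- Assumption: $service_name holds the name of the SPCS inference service.
-- Both template arguments are strings, so the prompt stays text-only.
SELECT AI_COMPLETE(
    $service_name,
    PROMPT('Translate the following text into {0}: {1}', 'French', 'Good morning')
) AS response;
```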
Joining with a table¶
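One possible shape for this pattern builds each prompt from columns of two joined tables. The tables tickets and products, their columns, and the session variable service_name are all hypothetical.

```sql
-- Hypothetical tables: tickets(ticket_id, product_id, body)
--                      products(product_id, product_name)
-- Assumption: $service_name holds the name of the SPCS inference service.
SELECT
    t.ticket_id,
    p.product_name,
    AI_COMPLETE(
        $service_name,
        PROMPT(
            'Draft a short reply to this support ticket about {0}: {1}',
            p.product_name,
            t.body
        )
    ) AS draft_reply
FROM tickets AS t
JOIN products AS p
    ON p.product_id = t.product_id;
```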
Limitations¶
The following AI_COMPLETE inputs are not yet supported when the function points to an SPCS service.
FILE input¶
The single-file overload of AI_COMPLETE is rejected with the error: NOT_YET_IMPLEMENTED: multimodal file input for an AI function calling an SPCS service.
Multimodal PROMPT() and PROMPT_FORMAT() input¶
FILE, OBJECT, or ARRAY arguments inside PROMPT can make the template multimodal, so prompts that contain them are rejected with the error: multimodal prompt input for an AI function calling an SPCS service.
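As an illustration, the following sketch shows a call that is expected to be rejected because one template argument is a FILE. The stage @my_stage, the file name, and the session variable service_name are hypothetical.

```sql
-- Expected to fail: TO_FILE returns a FILE value, which makes the prompt multimodal.
-- Assumption: $service_name holds the name of the SPCS inference service.
SELECT AI_COMPLETE(
    $service_name,
    PROMPT('Describe the contents of {0}', TO_FILE('@my_stage', 'diagram.png'))
) AS response;
```

Passing only text arguments to PROMPT avoids this error.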
guardrails model parameter¶
A query that sets the guardrails option in model_parameters is rejected with an error.
Access control requirements¶
The user must have the USAGE privilege on the SPCS service and on its inference function.
Legal notices¶
Refer to Snowflake AI and ML for legal notices.