OpenAI SDK¶
Use the OpenAI SDK with the Snowflake Cortex LLM REST API to quickly evaluate Snowflake Cortex models. For information about the REST API, see Cortex REST API.
Important
Make sure you’re using an official version of the OpenAI SDK in one of the following languages:

- Python
- TypeScript/JavaScript
Getting started with the OpenAI SDK¶
To use the OpenAI SDK with Snowflake-hosted models, you must do the following:

- Use an official version of the OpenAI SDK.
- Provide the following:
  - Your account identifier in the base URL
  - Your Snowflake Programmatic Access Token (PAT) as the API key for the OpenAI client. For information about creating a PAT, see Generating a programmatic access token.
  - The model name in the request. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).

View OpenAI compatible API support for the features supported by the OpenAI SDK.
Simple code examples¶
The following examples show how to make requests to Snowflake-hosted models using the Python SDK, the JavaScript/TypeScript SDK, and curl.
Use the following code to help you get started with the Python SDK:
from openai import OpenAI

client = OpenAI(
    api_key="<SNOWFLAKE_PAT>",
    base_url="https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
)

response = client.chat.completions.create(
    model="<model_name>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
)

print(response.choices[0].message)
In the preceding code, specify values for the following:
- base_url: Replace <account-identifier> with your Snowflake account identifier.
- api_key: Replace <SNOWFLAKE_PAT> with your Snowflake Programmatic Access Token (PAT).
- model: Replace <model_name> with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).
Use the following code to help you get started with the JavaScript/TypeScript SDK:
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
});

const response = await openai.chat.completions.create({
  model: "<model_name>",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How does a snowflake get its unique pattern?" },
  ],
});

console.log(response.choices[0].message);
In the preceding code, specify values for the following:
- baseURL: Replace <account-identifier> with your Snowflake account identifier.
- apiKey: Replace <SNOWFLAKE_PAT> with your Snowflake Programmatic Access Token (PAT).
- model: Replace <model_name> with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).
Use the following curl command to make a request to the Snowflake-hosted model:
curl "https://<account-identifier>.snowflakecomputing.com/api/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "<model_name>",
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
  }'
In the preceding code, specify values for the following:
- <account-identifier>: Replace with your Snowflake account identifier.
- <SNOWFLAKE_PAT>: Replace with your Snowflake Programmatic Access Token (PAT).
- <model_name>: Replace with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).
Stream responses¶
You can stream responses from the REST API by setting the stream parameter to True (true in JavaScript and JSON) in the request.
The following Python code streams a response from the REST API:
from openai import OpenAI

client = OpenAI(
    api_key="<SNOWFLAKE_PAT>",
    base_url="https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
)

stream = client.chat.completions.create(
    model="<model_name>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
The following JavaScript/TypeScript code streams a response from the REST API:
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
});

const stream = await openai.chat.completions.create({
  model: "<model_name>",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How does a snowflake get its unique pattern?" },
  ],
  stream: true,
});

for await (const event of stream) {
  console.log(event);
}
The following curl command streams a response from the Snowflake-hosted model:
curl "https://<account-identifier>.snowflakecomputing.com/api/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "<model_name>",
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    "stream": true
  }'
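When stream is set to true, the endpoint returns server-sent events, where each data: line carries one chunk of the completion. The following sketch shows how such a stream can be assembled into the full response text; the sample payloads below are hypothetical, simplified stand-ins for real chunks, not actual Cortex output.

```python
import json

# Hypothetical server-sent events, shaped like OpenAI-style streaming chunks.
sample_sse = """data: {"choices": [{"delta": {"content": "Snowflakes "}}]}

data: {"choices": [{"delta": {"content": "form in clouds."}}]}

data: [DONE]
"""

def collect_stream_text(sse_body: str) -> str:
    """Concatenate the delta content from each `data:` event, stopping at [DONE]."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)

print(collect_stream_text(sample_sse))
```

The SDKs handle this parsing for you; the sketch is only meant to show what the raw curl output contains.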
Limitations¶
The following are limitations with using the OpenAI SDK with Snowflake-hosted models:
- Only the Completions API is supported (you can make requests such as POST /chat/completions and chat.completions.create()).
- Snowflake only supports up to 4096 tokens in the output. Additional tokens are not returned.
- Tool calling is only supported for the Claude Sonnet models. For an example that uses tool calling effectively, see Tool calling with chain of thought example.
- Audio isn’t supported.
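Because tool calling is limited to the Claude Sonnet models, a tool-calling request passes a standard OpenAI-style tools array. The get_weather function below is a hypothetical example for illustration, not part of the Cortex API:

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

# With a Claude Sonnet model, the tools are passed as in any OpenAI SDK request:
# response = client.chat.completions.create(
#     model="claude-3-7-sonnet",
#     messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
#     tools=tools,
#     tool_choice="auto",
# )
# Tool calls, when issued, appear in response.choices[0].message.tool_calls.

# The schema must be JSON-serializable, since it is sent in the request body.
print(json.dumps(tools)[:40])
```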
OpenAI compatible API support¶
This section outlines the compatibility of the Snowflake Cortex OpenAI endpoint with the OpenAI API. The following tables summarize which request and response fields and headers are supported when using the OpenAI SDKs with Snowflake-hosted models.
**Request body fields**

| Field | Support |
|---|---|
| model | Supported |
| messages | Supported |
| audio | Not Supported |
| frequency_penalty | Not Supported |
| function_call | Not Supported |
| logit_bias | Not Supported |
| logprobs | Not Supported |
| max_tokens | Supported (1 - 4096) |
| max_completion_tokens | Not Supported |
| metadata | Not Supported |
| modalities | Not Supported |
| n | Not Supported |
| parallel_tool_calls | Not Supported |
| prediction | Not Supported |
| presence_penalty | Not Supported |
| reasoning_effort | Not Supported |
| response_format | Not Supported |
| seed | Not Supported |
| service_tier | Not Supported |
| stop | Not Supported |
| store | Not Supported |
| stream | Supported |
| stream_options | Supported |
| temperature | Supported |
| tool_choice | Supported |
| tools | Supported |
| top_logprobs | Not Supported |
| top_p | Supported |
| user | Not Supported |
| web_search_options | Not Supported |
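Based on the request fields marked Supported above, a request body that stays within the supported surface might look like the following sketch (the values are illustrative placeholders):

```python
import json

# A request body using only fields the endpoint supports, per the table above.
# Values are illustrative; <model_name> must be replaced with a real model.
payload = {
    "model": "<model_name>",
    "messages": [
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    "max_tokens": 4096,   # supported range is 1 - 4096
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": False,
}

body = json.dumps(payload)
```

Unsupported fields such as n, seed, or response_format should be omitted rather than sent with default values.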
**Response body fields**

| Field | Support |
|---|---|
| id | Supported |
| object | Not Supported |
| created | Supported |
| model | Supported |
| choices | Supported |
| choices[].index | Supported (Always single choice) |
| choices[].message | Not Supported |
| choices[].message.role | Not Supported |
| choices[].message.content | Supported |
| choices[].message.refusal | Not Supported |
| choices[].message.annotations | Not Supported |
| choices[].message.audio | Not Supported |
| choices[].message.function_call | Not Supported |
| choices[].message.tool_calls | Supported |
| choices[].delta | Supported |
| choices[].delta.content_list[].tool_use_id | Supported |
| choices[].logprobs | Not Supported |
| choices[].finish_reason | Not Supported |
| usage | Supported |
| usage.prompt_tokens | Supported |
| usage.completion_tokens | Supported |
| usage.total_tokens | Supported |
| usage.prompt_tokens_details | Not Supported |
| usage.completion_tokens_details | Not Supported |
| service_tier | Not Supported |
| system_fingerprint | Not Supported |
**Request headers**

| Header | Support |
|---|---|
| Authorization | Required |
| Content-Type | Supported (application/json) |
| Accept | Supported (application/json, text/event-stream) |
**Response headers**

| Header | Support |
|---|---|
| openai-organization | Not Supported |
| openai-version | Not Supported |
| openai-processing-ms | Not Supported |
| x-ratelimit-limit-requests | Not Supported |
| x-ratelimit-limit-tokens | Not Supported |
| x-ratelimit-remaining-requests | Not Supported |
| x-ratelimit-remaining-tokens | Not Supported |
| x-ratelimit-reset-requests | Not Supported |
| x-ratelimit-reset-tokens | Not Supported |
| retry-after | Not Supported |
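Since the usage fields are supported in responses, token accounting can be read directly from a completed response. The response body below is a hypothetical example shaped like the supported fields, not actual Cortex output:

```python
# Hypothetical response body containing only supported fields, per the tables above.
sample_response = {
    "id": "chatcmpl-123",
    "created": 1700000000,
    "model": "<model_name>",
    "choices": [
        {
            "index": 0,
            "message": {"content": "Snowflakes form around dust particles..."},
        }
    ],
    "usage": {"prompt_tokens": 24, "completion_tokens": 58, "total_tokens": 82},
}

usage = sample_response["usage"]
# total_tokens is the sum of prompt and completion tokens.
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
```

With the Python SDK, the same values are available as attributes, e.g. response.usage.total_tokens.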