OpenAI SDK¶
Use the OpenAI SDK with the Snowflake Cortex LLM REST API to quickly evaluate Snowflake Cortex models. For information about the REST API, see Cortex REST API.
Important
Make sure you’re using an official version of the OpenAI SDK in one of the following languages:

- Python
- TypeScript/JavaScript
Getting started with the OpenAI SDK¶
To use the OpenAI SDK with Snowflake-hosted models, you must do the following:

- Use an official version of the OpenAI SDK.
- Provide the following:
  - Your account identifier in the base URL
  - Your Snowflake Programmatic Access Token (PAT) as the API key for the OpenAI client. For information about creating a PAT, see Generating a programmatic access token.
  - The model name in the request. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).

View OpenAI compatible API support for the features supported by the OpenAI SDK.
Simple code examples¶
The following examples show how to make requests to Snowflake-hosted models using the Python SDK, the JavaScript/TypeScript SDK, and curl.
Use the following code to help you get started with the Python SDK:
from openai import OpenAI

client = OpenAI(
    api_key="<SNOWFLAKE_PAT>",
    base_url="https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
)

response = client.chat.completions.create(
    model="<model_name>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
)

print(response.choices[0].message)
In the preceding code, specify values for the following:
- base_url: Replace <account-identifier> with your Snowflake account identifier.
- api_key: Replace <SNOWFLAKE_PAT> with your Snowflake Programmatic Access Token (PAT).
- model: Replace <model_name> with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).
Use the following code to help you get started with the JavaScript/TypeScript SDK:
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
});

const response = await openai.chat.completions.create({
  model: "<model_name>",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How does a snowflake get its unique pattern?" },
  ],
});

console.log(response.choices[0].message);
In the preceding code, specify values for the following:
- baseURL: Replace <account-identifier> with your Snowflake account identifier.
- apiKey: Replace <SNOWFLAKE_PAT> with your Snowflake Programmatic Access Token (PAT).
- model: Replace <model_name> with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).
Use the following curl command to make a request to the Snowflake-hosted model:
curl "https://<account-identifier>.snowflakecomputing.com/api/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "<model_name>",
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
  }'
In the preceding code, specify values for the following:
- <account-identifier>: Replace with your Snowflake account identifier.
- <SNOWFLAKE_PAT>: Replace with your Snowflake Programmatic Access Token (PAT).
- <model_name>: Replace with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).
Stream responses¶
You can stream responses from the REST API by setting the stream parameter to True (true in JavaScript and JSON) in the request.
The following Python code streams a response from the REST API:
from openai import OpenAI

client = OpenAI(
    api_key="<SNOWFLAKE_PAT>",
    base_url="https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
)

stream = client.chat.completions.create(
    model="<model_name>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
The following JavaScript/TypeScript code streams a response from the REST API:
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
});

const stream = await openai.chat.completions.create({
  model: "<model_name>",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How does a snowflake get its unique pattern?" },
  ],
  stream: true,
});

for await (const event of stream) {
  console.log(event);
}
The following curl command streams a response from the Snowflake-hosted model:
curl "https://<account-identifier>.snowflakecomputing.com/api/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "<model_name>",
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    "stream": true
  }'
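When stream is set to true, the endpoint returns server-sent events, where each data: line carries one chunk of the completion. The following sketch shows how such a stream can be assembled into the full response text; the sample payloads below are hypothetical, simplified stand-ins for real chunks, not actual Cortex output.

```python
import json

# Hypothetical server-sent events, shaped like OpenAI-style streaming chunks.
sample_sse = """data: {"choices": [{"delta": {"content": "Snowflakes "}}]}

data: {"choices": [{"delta": {"content": "form in clouds."}}]}

data: [DONE]
"""

def collect_stream_text(sse_body: str) -> str:
    """Concatenate the delta content from each `data:` event, stopping at [DONE]."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)

print(collect_stream_text(sample_sse))
```

The SDKs handle this parsing for you; the sketch is only meant to show what the raw curl output contains.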
Limitations¶
The following are limitations with using the OpenAI SDK with Snowflake-hosted models:
- Only the Completions API is supported (you can make requests such as POST /chat/completions and chat.completions.create()).
- Snowflake only supports up to 4096 tokens in the output. Additional tokens are not returned.
- Tool calling is only supported for the Claude Sonnet models. For an example that uses tool calling effectively, see Tool calling with chain of thought example.
- Audio isn’t supported.
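Because tool calling is limited to the Claude Sonnet models, a tool-calling request passes a standard OpenAI-style tools array. The get_weather function below is a hypothetical example for illustration, not part of the Cortex API:

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

# With a Claude Sonnet model, the tools are passed as in any OpenAI SDK request:
# response = client.chat.completions.create(
#     model="claude-3-7-sonnet",
#     messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
#     tools=tools,
#     tool_choice="auto",
# )
# Tool calls, when issued, appear in response.choices[0].message.tool_calls.

# The schema must be JSON-serializable, since it is sent in the request body.
print(json.dumps(tools)[:40])
```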
OpenAI compatible API support¶
This section outlines the compatibility of the Snowflake Cortex OpenAI endpoint with the OpenAI API. The following tables summarize which request and response fields and headers are supported when using the OpenAI SDKs with Snowflake-hosted models.
**Request body fields**

| Field | Support |
|---|---|
| model | Supported |
| messages | Supported |
| audio | Not Supported |
| frequency_penalty | Not Supported |
| function_call | Not Supported |
| logit_bias | Not Supported |
| logprobs | Not Supported |
| max_tokens | Supported (1 - 4096) |
| max_completion_tokens | Not Supported |
| metadata | Not Supported |
| modalities | Not Supported |
| n | Not Supported |
| parallel_tool_calls | Not Supported |
| prediction | Not Supported |
| presence_penalty | Not Supported |
| reasoning_effort | Not Supported |
| response_format | Not Supported |
| seed | Not Supported |
| service_tier | Not Supported |
| stop | Not Supported |
| store | Not Supported |
| stream | Supported |
| stream_options | Supported |
| temperature | Supported |
| tool_choice | Supported |
| tools | Supported |
| top_logprobs | Not Supported |
| top_p | Supported |
| user | Not Supported |
| web_search_options | Not Supported |
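Based on the request fields marked Supported above, a request body that stays within the supported surface might look like the following sketch (the values are illustrative placeholders):

```python
import json

# A request body using only fields the endpoint supports, per the table above.
# Values are illustrative; <model_name> must be replaced with a real model.
payload = {
    "model": "<model_name>",
    "messages": [
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    "max_tokens": 4096,   # supported range is 1 - 4096
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": False,
}

body = json.dumps(payload)
```

Unsupported fields such as n, seed, or response_format should be omitted rather than sent with default values.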
**Response body fields**

| Field | Support |
|---|---|
| id | Supported |
| object | Not Supported |
| created | Supported |
| model | Supported |
| choices | Supported |
| choices[].index | Supported (Always single choice) |
| choices[].message | Not Supported |
| choices[].message.role | Not Supported |
| choices[].message.content | Supported |
| choices[].message.refusal | Not Supported |
| choices[].message.annotations | Not Supported |
| choices[].message.audio | Not Supported |
| choices[].message.function_call | Not Supported |
| choices[].message.tool_calls | Supported |
| choices[].delta | Supported |
| choices[].delta.content_list[].tool_use_id | Supported |
| choices[].logprobs | Not Supported |
| choices[].finish_reason | Not Supported |
| usage | Supported |
| usage.prompt_tokens | Supported |
| usage.completion_tokens | Supported |
| usage.total_tokens | Supported |
| usage.prompt_tokens_details | Not Supported |
| usage.completion_tokens_details | Not Supported |
| service_tier | Not Supported |
| system_fingerprint | Not Supported |
**Request headers**

| Header | Support |
|---|---|
| Authorization | Required |
| Content-Type | Supported (application/json) |
| Accept | Supported (application/json, text/event-stream) |
**Response headers**

| Header | Support |
|---|---|
| openai-organization | Not Supported |
| openai-version | Not Supported |
| openai-processing-ms | Not Supported |
| x-ratelimit-limit-requests | Not Supported |
| x-ratelimit-limit-tokens | Not Supported |
| x-ratelimit-remaining-requests | Not Supported |
| x-ratelimit-remaining-tokens | Not Supported |
| x-ratelimit-reset-requests | Not Supported |
| x-ratelimit-reset-tokens | Not Supported |
| retry-after | Not Supported |
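Since the usage fields are supported in responses, token accounting can be read directly from a completed response. The response body below is a hypothetical example shaped like the supported fields, not actual Cortex output:

```python
# Hypothetical response body containing only supported fields, per the tables above.
sample_response = {
    "id": "chatcmpl-123",
    "created": 1700000000,
    "model": "<model_name>",
    "choices": [
        {
            "index": 0,
            "message": {"content": "Snowflakes form around dust particles..."},
        }
    ],
    "usage": {"prompt_tokens": 24, "completion_tokens": 58, "total_tokens": 82},
}

usage = sample_response["usage"]
# total_tokens is the sum of prompt and completion tokens.
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
```

With the Python SDK, the same values are available as attributes, e.g. response.usage.total_tokens.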