OpenAI SDK¶

Use the OpenAI SDK with the Snowflake Cortex LLM REST API to quickly evaluate Snowflake Cortex models. For information about the REST API, see Cortex REST API.

Important

Make sure you’re using an official version of the OpenAI SDK in one of the following languages:

  • Python

  • TypeScript/JavaScript

Getting started with the OpenAI SDK¶

To use the OpenAI SDK with Snowflake-hosted models, point the SDK at your account's OpenAI-compatible endpoint (https://<account-identifier>.snowflakecomputing.com/api/v1/openai) and authenticate with a Snowflake Programmatic Access Token (PAT). The examples in the next section show both steps.

Simple code examples¶

The following examples show how to make requests to the Snowflake Cortex OpenAI-compatible endpoint using the Python SDK, the JavaScript/TypeScript SDK, and curl.

Use the following code to help you get started with the Python SDK:

from openai import OpenAI

client = OpenAI(
    api_key="<SNOWFLAKE_PAT>",
    base_url="https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
)

response = client.chat.completions.create(
    model="<model_name>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "How does a snowflake get its unique pattern?"
        }
    ]
)

print(response.choices[0].message)

In the preceding code, specify values for the following:

  • base_url: Replace <account-identifier> with your Snowflake account identifier.

  • api_key: Replace <SNOWFLAKE_PAT> with your Snowflake Programmatic Access Token (PAT).

  • model: Replace <model_name> with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).

Use the following code to help you get started with the JavaScript/TypeScript SDK:

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "<SNOWFLAKE_PAT>",
    baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
});

const response = await openai.chat.completions.create({
    model: "<model_name>",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        {
            role: "user",
            content: "How does a snowflake get its unique pattern?",
        },
    ],
});

console.log(response.choices[0].message);

In the preceding code, specify values for the following:

  • baseURL: Replace <account-identifier> with your Snowflake account identifier.

  • apiKey: Replace <SNOWFLAKE_PAT> with your Snowflake Programmatic Access Token (PAT).

  • model: Replace <model_name> with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).

Use the following curl command to make a request to the Snowflake-hosted model:

curl "https://<account-identifier>.snowflakecomputing.com/api/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "<model_name>",
    "messages": [
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
  }'

In the preceding command, specify values for the following:

  • <account-identifier>: Replace <account-identifier> with your Snowflake account identifier.

  • <SNOWFLAKE_PAT>: Replace <SNOWFLAKE_PAT> with your Snowflake Programmatic Access Token (PAT).

  • <model_name>: Replace <model_name> with the name of the model you want to use. For a list of supported models, see Snowflake Cortex AISQL (including LLM functions).

Stream responses¶

You can stream responses from the REST API by setting the stream parameter to true in the request.

The following Python code streams a response from the REST API:

from openai import OpenAI

client = OpenAI(
    api_key="<SNOWFLAKE_PAT>",
    base_url="https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
)

response = client.chat.completions.create(
    model="<model_name>",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "How does a snowflake get its unique pattern?"
        }
    ],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content, end="", flush=True)

The following JavaScript/TypeScript code streams a response from the REST API:

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: "<SNOWFLAKE_PAT>",
    baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v1/openai"
});

const stream = await openai.chat.completions.create({
    model: "<model_name>",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        {
            role: "user",
            content: "How does a snowflake get its unique pattern?",
        },
    ],
    stream: true,
});

for await (const event of stream) {
    console.log(event);
}

The following curl command streams a response from the Snowflake-hosted model:

curl "https://<account-identifier>.snowflakecomputing.com/api/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "<model_name>",
    "messages": [
        {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    "stream": true
  }'

Limitations¶

The following are limitations of using the OpenAI SDK with Snowflake-hosted models:

  • Only the Chat Completions API is supported (requests such as POST /chat/completions and chat.completions.create()).

  • Snowflake only supports up to 4096 tokens in the output. Additional tokens are not returned.

  • Tool calling is only supported for the Claude Sonnet models. For an example that uses tool calling effectively, see Tool calling with chain of thought example.

  • Audio isn’t supported.
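Because tool calling is limited to the Claude Sonnet models, it helps to see what a tool-calling request body looks like. The following is a minimal sketch: get_weather is a hypothetical tool invented for illustration, and <model_name> is a placeholder for a Claude Sonnet model name. The payload built here is what you would pass to client.chat.completions.create(**payload), or POST as JSON to /chat/completions.

```python
import json

# Hypothetical tool definition (get_weather is an illustrative name,
# not a Snowflake-provided tool).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

# Request body for client.chat.completions.create(**payload) with a
# Claude Sonnet model; tools and tool_choice are supported request fields.
payload = {
    "model": "<model_name>",
    "messages": [
        {"role": "user", "content": "What is the weather in Oslo?"}
    ],
    "tools": tools,
    "tool_choice": "auto",
}

print(json.dumps(payload, indent=2))
```

When the model decides to call the tool, the tool call appears in choices[].message.tool_calls (non-streaming) or in the streamed deltas.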

OpenAI compatible API support¶

This section outlines the compatibility of the Snowflake Cortex OpenAI endpoint with the OpenAI API. The following tables summarize which request and response fields and headers are supported when using the OpenAI SDKs with Snowflake-hosted models.

Request fields¶

Field                    Support
model                    Supported
messages                 Supported
audio                    Not Supported
frequency_penalty        Not Supported
function_call            Not Supported
logit_bias               Not Supported
logprobs                 Not Supported
max_tokens               Supported (1 - 4096)
max_completion_tokens    Not Supported
metadata                 Not Supported
modalities               Not Supported
n                        Not Supported
parallel_tool_calls      Not Supported
prediction               Not Supported
presence_penalty         Not Supported
reasoning_effort         Not Supported
response_format          Not Supported
seed                     Not Supported
service_tier             Not Supported
stop                     Not Supported
store                    Not Supported
stream                   Supported
stream_options           Supported
temperature              Supported
tool_choice              Supported
tools                    Supported
top_logprobs             Not Supported
top_p                    Supported
user                     Not Supported
web_search_options       Not Supported
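As a sketch of how the supported request fields combine, the following builds a request body using only fields marked Supported in the table above. The values are illustrative, <model_name> is a placeholder, and include_usage is an OpenAI SDK stream_options field that requests token usage in the final streamed chunk (assumed to be honored by the endpoint).

```python
import json

# Request body restricted to fields the table marks as Supported.
payload = {
    "model": "<model_name>",
    "messages": [
        {"role": "user", "content": "Name three kinds of snow crystal."}
    ],
    "max_tokens": 512,       # must stay within the supported 1 - 4096 range
    "temperature": 0.2,
    "top_p": 0.9,
    "stream": True,
    # include_usage asks for token usage in the final streamed chunk
    # (an OpenAI SDK option; assumed supported here).
    "stream_options": {"include_usage": True},
}

print(json.dumps(payload, indent=2))
```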

Response fields¶

Field                                       Support
id                                          Supported
object                                      Not Supported
created                                     Supported
model                                       Supported
choices                                     Supported
choices[].index                             Supported (Always single choice)
choices[].message                           Not Supported
choices[].message.role                      Not Supported
choices[].message.content                   Supported
choices[].message.refusal                   Not Supported
choices[].message.annotations               Not Supported
choices[].message.audio                     Not Supported
choices[].message.function_call             Not Supported
choices[].message.tool_calls                Supported
choices[].delta                             Supported
choices[].delta.content_list[].tool_use_id  Supported
choices[].logprobs                          Not Supported
choices[].finish_reason                     Not Supported
usage                                       Supported
usage.prompt_tokens                         Supported
usage.completion_tokens                     Supported
usage.total_tokens                          Supported
usage.prompt_tokens_details                 Not Supported
usage.completion_tokens_details             Not Supported
service_tier                                Not Supported
system_fingerprint                          Not Supported

Request headers¶

Header         Support
Authorization  Required
Content-Type   Supported (application/json)
Accept         Supported (application/json, text/event-stream)
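If you are building requests without the SDK, the header table above translates into two header sets: JSON for non-streaming responses and server-sent events for streaming. The following is a small sketch; cortex_headers is a hypothetical helper name, and <SNOWFLAKE_PAT> is a placeholder for your Programmatic Access Token.

```python
# Build the request headers per the table above: Authorization is required,
# Content-Type is application/json, and Accept selects JSON (non-streaming)
# or server-sent events (streaming).
def cortex_headers(pat: str, streaming: bool = False) -> dict:
    return {
        "Authorization": f"Bearer {pat}",
        "Content-Type": "application/json",
        "Accept": "text/event-stream" if streaming else "application/json",
    }

print(cortex_headers("<SNOWFLAKE_PAT>", streaming=True))
```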

Response headers¶

Header                          Support
openai-organization             Not Supported
openai-version                  Not Supported
openai-processing-ms            Not Supported
x-ratelimit-limit-requests      Not Supported
x-ratelimit-limit-tokens        Not Supported
x-ratelimit-remaining-requests  Not Supported
x-ratelimit-remaining-tokens    Not Supported
x-ratelimit-reset-requests      Not Supported
x-ratelimit-reset-tokens        Not Supported
retry-after                     Not Supported