Cortex REST API¶

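The Cortex REST API gives you access to leading frontier models from Anthropic, OpenAI, Meta, Mistral, and others through the endpoint or SDK of your choice. All inference runs inside the Snowflake perimeter, so your data stays secured and within your governance boundary. To get started, see the following sections.

Choose your API¶

The Cortex REST API supports two industry-standard API specifications. Choose the one that best fits your stack.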


	Chat Completions API	Messages API
Compatibility	OpenAI Chat Completions API	Anthropic Messages API
Endpoint	`/api/v2/cortex/v1/chat/completions`	`/api/v2/cortex/v1/messages`
Supported models	All models (OpenAI, Claude, Llama, Mistral, DeepSeek, Snowflake)	Claude models only
SDK support	OpenAI Python and JavaScript SDKs	Anthropic Python SDK
Best for	Most use cases, multi-model flexibility	Existing Anthropic integrations, Anthropic API parity

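Both APIs share the same authentication, model catalog, and rate limits. The only differences are the request/response format and the models that each endpoint supports. For pricing, see the `Snowflake Service Consumption Table <https://www.snowflake.com/legal-files/CreditConsumptionTable.pdf>`_.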
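
The choice between the two endpoints can be expressed as a small helper. This is a minimal sketch, not part of any Snowflake SDK: the function name ``endpoint_for`` and the ``prefer_messages_api`` flag are hypothetical, and it assumes that Claude model names start with ``claude-``, as they do in the model tables on this page.

```python
CHAT_COMPLETIONS_PATH = "/api/v2/cortex/v1/chat/completions"
MESSAGES_PATH = "/api/v2/cortex/v1/messages"


def endpoint_for(model: str, prefer_messages_api: bool = False) -> str:
    """Return the endpoint path to call for a model name.

    Assumption: Claude model names start with "claude-".
    Only Claude models can use the Messages API; every other
    model falls back to the Chat Completions endpoint.
    """
    is_claude = model.startswith("claude-")
    if prefer_messages_api and is_claude:
        return MESSAGES_PATH
    return CHAT_COMPLETIONS_PATH


print(endpoint_for("claude-sonnet-4-5", prefer_messages_api=True))
print(endpoint_for("llama3.1-70b", prefer_messages_api=True))
```

The second call falls back to the Chat Completions path because the Messages API supports Claude models only.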

Quickstart¶

Prerequisites¶

Before you begin, you need the following:

A **Snowflake account URL** (for example, https://<account-identifier>.snowflakecomputing.com).
A **Snowflake programmatic access token** (PAT) for authentication. See Generating a programmatic access token.
The **model name** to use in your request. For available models, see Model availability.

Chat Completions quickstart¶

The Chat Completions API follows the OpenAI specification, so you can use the OpenAI SDK directly.

from openai import OpenAI

client = OpenAI(
  api_key="<SNOWFLAKE_PAT>",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
)

response = client.chat.completions.create(
  model="claude-sonnet-4-5",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How does a snowflake get its unique pattern?"}
  ]
)

print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How does a snowflake get its unique pattern?" }
  ],
});

console.log(response.choices[0].message.content);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
  }'

In the preceding examples, replace the following:

<account-identifier>: Your Snowflake account identifier.
<SNOWFLAKE_PAT>: Your Snowflake programmatic access token (PAT).
model: The model name. For supported models, see Model availability.

Messages API quickstart¶

The Messages API follows the Anthropic specification and supports only Claude models.

By default, the Anthropic SDK sends credentials in the ``x-api-key`` header, but Snowflake expects a ``Bearer`` token. Use an ``httpx`` client to set the correct authorization header.
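
Rather than pasting the account identifier and PAT into source files, you might read them from the environment. The sketch below is one way to do that; the environment variable names ``SNOWFLAKE_ACCOUNT`` and ``SNOWFLAKE_PAT`` and the helper name are assumptions chosen for illustration, not names that Snowflake or the OpenAI SDK define.

```python
import os


def cortex_openai_settings(account_identifier=None, pat=None):
    """Build the api_key and base_url arguments for the OpenAI client.

    Falls back to the (hypothetical) SNOWFLAKE_ACCOUNT and SNOWFLAKE_PAT
    environment variables when an argument is not supplied.
    """
    account_identifier = account_identifier or os.environ.get("SNOWFLAKE_ACCOUNT")
    pat = pat or os.environ.get("SNOWFLAKE_PAT")
    if not account_identifier or not pat:
        raise ValueError("Provide the account identifier and the PAT")
    return {
        "api_key": pat,
        "base_url": (
            f"https://{account_identifier}.snowflakecomputing.com"
            "/api/v2/cortex/v1"
        ),
    }


# "myorg-myaccount" and "example-token" are placeholder values
settings = cortex_openai_settings("myorg-myaccount", "example-token")
print(settings["base_url"])
# With the OpenAI SDK installed: client = OpenAI(**settings)
```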

import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={"Authorization": f"Bearer {PAT}"},
)

response = client.messages.create(
  model="claude-sonnet-4-5",
  max_tokens=1024,
  messages=[
    {"role": "user", "content": "How does a snowflake get its unique pattern?"}
  ],
)

print(response.content[0].text)

As in Python, override the default authentication header with a ``Bearer`` token through ``defaultHeaders``.

import Anthropic from "@anthropic-ai/sdk";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "How does a snowflake get its unique pattern?" }
  ],
});

console.log(response.content[0].text);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
  }'

In the preceding examples, replace the following:

<account-identifier>: Your Snowflake account identifier.
<SNOWFLAKE_PAT>: Your Snowflake programmatic access token (PAT).
model: The Claude model name. For supported models, see Model availability.

Setting up authentication¶

To authenticate to the Cortex REST API, you can use the methods described in Authenticating Snowflake REST APIs with Snowflake.

Set the ``Authorization`` header to include your token (for example, a JSON Web Token (JWT), an OAuth token, or a programmatic access token).
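
To make the header layout concrete, the following sketch builds (but does not send) a Chat Completions request with the Python standard library. The helper name and the placeholder account and token values are hypothetical; the endpoint path and the ``Bearer`` scheme are the ones used in the curl examples on this page.

```python
import json
import urllib.request


def build_chat_request(account_identifier, token, model, prompt):
    """Return a POST request carrying the token in the Authorization header."""
    url = (
        f"https://{account_identifier}.snowflakecomputing.com"
        "/api/v2/cortex/v1/chat/completions"
    )
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here
req = build_chat_request("myorg-myaccount", "example-token", "claude-sonnet-4-5", "Hello")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req)
```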

Tip

Consider creating a dedicated user for Cortex REST API requests.

Setting up authorization¶

To send a REST API request, your default role must be granted either the SNOWFLAKE.CORTEX_USER database role or the SNOWFLAKE.CORTEX_REST_API_USER database role. SNOWFLAKE.CORTEX_USER provides access to all Covered AI features including the Cortex REST API, whereas SNOWFLAKE.CORTEX_REST_API_USER provides access only to the Cortex REST API.

In most cases, users already have access because SNOWFLAKE.CORTEX_USER is granted to the PUBLIC role automatically, and all roles inherit PUBLIC.

If your Snowflake administrator has revoked this grant, it must be granted again:

GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE my_role;
GRANT ROLE my_role TO USER my_user;

Important

REST API requests use the user's default role, so that role must have the necessary privileges. You can change a user's default role with ALTER USER ... SET DEFAULT_ROLE.

ALTER USER my_user SET DEFAULT_ROLE=my_role;

Limiting access using the Cortex REST API user role¶

To provide selective access to the Cortex REST API for specific users, use the SNOWFLAKE.CORTEX_REST_API_USER database role. This role grants access to the Cortex REST API without granting access to other Cortex features such as Cortex AI functions, Cortex Agent, Cortex Analyst, Cortex Fine-tuning, or Cortex Search.

CORTEX_REST_API_USER is not granted to the PUBLIC role by default. An account administrator must explicitly grant this role to roles that require access to the Cortex REST API. The SNOWFLAKE.CORTEX_REST_API_USER database role can't be granted directly to a user. For more information, see Using SNOWFLAKE database roles.

Important

If your user roles already have the CORTEX_USER role, you must revoke access to the CORTEX_USER role before the CORTEX_REST_API_USER role can take effect as a fine-grained control.

REVOKE DATABASE ROLE SNOWFLAKE.CORTEX_USER FROM ROLE PUBLIC;

To provide access to the Cortex REST API, use the ACCOUNTADMIN role to do the following:

Grant the SNOWFLAKE.CORTEX_REST_API_USER database role to a custom role.
Assign this custom role to users.

The following example creates the custom role cortex_rest_api_role, grants it the CORTEX_REST_API_USER database role, and assigns the role to example_user:

USE ROLE ACCOUNTADMIN;
CREATE ROLE cortex_rest_api_role;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_REST_API_USER TO ROLE cortex_rest_api_role;

GRANT ROLE cortex_rest_api_role TO USER example_user;

You can also grant access to the Cortex REST API through existing roles. For example, if you have an api_consumer role used by a group of users, you can grant access with a single GRANT statement:

GRANT DATABASE ROLE SNOWFLAKE.CORTEX_REST_API_USER TO ROLE api_consumer;

Model availability¶

The following tables show the models available in the Cortex REST API for each region.


Model	Cross Cloud (Any Region)	AWS Global (Cross-Region)	AWS US (Cross-Region)	AWS EU (Cross-Region)	AWS APJ (Cross-Region)	Azure Global (Cross-Region)	Azure US (Cross-Region)	Azure EU (Cross-Region)
`claude-opus-4-7`	*	*	*	*
`claude-sonnet-4-6`	✔	✔	✔	✔	✔
`claude-opus-4-6`	✔	✔	✔	✔
`claude-sonnet-4-5`	✔	✔	✔	✔	✔
`claude-opus-4-5`	✔	✔	✔	✔
`claude-haiku-4-5`	✔	✔	✔	✔	✔
`claude-4-sonnet`	✔	✔	✔	✔	✔
`claude-4-opus`	✔	✔
`claude-3-7-sonnet`	✔	✔
`openai-gpt-5.4`	*					*	*
`openai-gpt-5.2`	✔					✔	✔
`openai-gpt-5.1`	✔					✔	✔	✔
`openai-gpt-5`	*					*	*	*
`openai-gpt-5-mini`	*					*	*
`openai-gpt-5-nano`	*					*	*
`openai-gpt-4.1`	✔					✔	✔
`openai-gpt-oss-120b`	*
`llama4-maverick`	✔	✔	✔
`llama3.1-8b`	✔	✔	✔	✔	✔	✔	✔	✔
`llama3.1-70b`	✔	✔	✔	✔	✔	✔	✔	✔
`llama3.1-405b`	✔	✔	✔			✔	✔
`deepseek-r1`	✔	✔	✔
`mistral-7b`	✔	✔
`mistral-large`	✔	✔
`mistral-large2`	✔	✔	✔	✔	✔	✔	✔	✔
`snowflake-llama-3.3-70b`	✔	✔	✔


Model	AWS US West 2 (Oregon)	AWS US East 1 (N. Virginia)	Azure East US 2 (Virginia)
`llama4-maverick`	✔
`llama3.1-8b`	✔	✔	✔
`llama3.1-70b`	✔	✔	✔
`llama3.1-405b`	✔	✔	✔
`deepseek-r1`	✔
`mistral-7b`	✔	✔	✔
`mistral-large`	✔	✔	✔
`mistral-large2`	✔	✔	✔
`snowflake-llama-3.3-70b`	✔


Model	AWS Europe Central 1 (Frankfurt)	AWS Europe West 1 (Ireland)	Azure West Europe (Netherlands)
`llama3.1-8b`	✔		✔
`llama3.1-70b`	✔	✔	✔
`mistral-7b`	✔		✔
`mistral-large`	✔		✔
`mistral-large2`	✔	✔	✔


Model	AWS AP Southeast 2 (Sydney)	AWS AP Northeast 1 (Tokyo)
`llama3.1-8b`		✔
`llama3.1-70b`	✔	✔
`mistral-7b`		✔
`mistral-large`		✔
`mistral-large2`	✔	✔

* Indicates a preview feature or model. Preview features are not suitable for production workloads.

You can also use fine-tuned models in supported regions.

Features¶

Streaming¶

Both APIs support streaming responses using `server-sent events <https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events>`_.
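
If you consume the stream without an SDK, you have to split it into events yourself. The following is a minimal sketch of a server-sent events reader that follows the general SSE framing rules (``event:`` and ``data:`` fields, a blank line ending each event, lines starting with ``:`` treated as comments). The function name is hypothetical, and the sample lines are illustrative rather than captured Cortex output.

```python
def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # drop the single optional leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


sample = [
    "event: content_block_delta\n",
    'data: {"a": 1}\n',
    "\n",
    ": keep-alive\n",
    "data: [DONE]\n",
    "\n",
]
for name, payload in parse_sse(sample):
    print(name, payload)
```

Events without an ``event:`` field get the default name ``message``, which is why one reader can serve both response formats.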

Chat Completions streaming¶

from openai import OpenAI

client = OpenAI(
  api_key="<SNOWFLAKE_PAT>",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
)

response = client.chat.completions.create(
  model="claude-sonnet-4-5",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How does a snowflake get its unique pattern?"}
  ],
  stream=True
)

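for chunk in response:
    # Skip chunks that carry no choices or no text in the delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)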

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
});

const stream = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "How does a snowflake get its unique pattern?" }
  ],
  stream: true,
});

for await (const event of stream) {
  process.stdout.write(event.choices[0]?.delta?.content || "");
}

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'
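
When you call the endpoint with curl or another raw HTTP client, each ``data:`` payload is a JSON chunk that you have to stitch together. The sketch below assumes the stream follows the OpenAI chunk format (text in ``choices[0].delta.content``, a final ``[DONE]`` sentinel, and a usage chunk when ``include_usage`` is set); the payloads shown are hand-written illustrations, not recorded output.

```python
import json


def collect_chat_stream(data_payloads):
    """Combine OpenAI-style streamed chunks into (text, usage)."""
    parts, usage = [], None
    for payload in data_payloads:
        if payload.strip() == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
        # The usage chunk may carry an empty choices list
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


payloads = [
    '{"choices": [{"delta": {"role": "assistant", "content": ""}}]}',
    '{"choices": [{"delta": {"content": "Snow"}}]}',
    '{"choices": [{"delta": {"content": "flake"}}]}',
    '{"choices": [], "usage": {"total_tokens": 7}}',
    "[DONE]",
]
text, usage = collect_chat_stream(payloads)
print(text, usage)
```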

Messages API streaming¶

import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={"Authorization": f"Bearer {PAT}"},
)

with client.messages.stream(
  model="claude-sonnet-4-5",
  max_tokens=1024,
  messages=[
    {"role": "user", "content": "How does a snowflake get its unique pattern?"}
  ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

import Anthropic from "@anthropic-ai/sdk";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

const stream = client.messages.stream({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "How does a snowflake get its unique pattern?" }
  ],
});

for await (const event of stream) {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "How does a snowflake get its unique pattern?"}
    ]
  }'
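
For a raw HTTP client, the text of a Messages API stream arrives in ``content_block_delta`` events whose delta has the type ``text_delta``, which is the same condition the JavaScript example checks. The following sketch collects that text from already-parsed ``(event_name, data)`` pairs; the function name and the sample events are illustrative assumptions.

```python
import json


def collect_message_text(events):
    """Join the text deltas from (event_name, data_json) pairs."""
    parts = []
    for _name, data in events:
        payload = json.loads(data)
        kind = payload.get("type")
        if kind == "content_block_delta":
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
        elif kind == "message_stop":
            break  # the message is complete
    return "".join(parts)


events = [
    ("message_start", '{"type": "message_start"}'),
    ("content_block_delta",
     '{"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Ice "}}'),
    ("content_block_delta",
     '{"type": "content_block_delta", "delta": {"type": "text_delta", "text": "crystals"}}'),
    ("message_stop", '{"type": "message_stop"}'),
]
print(collect_message_text(events))
```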

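Tool calling¶

Tool calling lets the model invoke external functions during a conversation. The flow works in the following steps:

You send a request along with a list of available tools.
The model decides to call one or more tools and returns the tool names and arguments.
You execute the tools on your side.
You send the tool results back, and the model generates the final response.

Tool calling is supported for OpenAI and Claude models.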
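
The "execute the tool" step is yours to implement. One possible shape for it is a registry that maps tool names to Python functions, sketched below. ``TOOL_REGISTRY``, ``run_tool``, and the canned ``get_weather`` result are hypothetical; the sketch accepts arguments either as a JSON string or as a dict, since the two APIs on this page deliver them differently.

```python
import json


def get_weather(location: str) -> dict:
    # Hypothetical stand-in; a real tool would query a weather service
    return {"location": location, "temperature": "69°F", "condition": "sunny"}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool(name, arguments) -> str:
    """Run a registered tool and return its result as a JSON string.

    Errors are returned as JSON too, so they can be sent back to the
    model instead of aborting the conversation.
    """
    if name not in TOOL_REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments) if isinstance(arguments, str) else dict(arguments)
        return json.dumps(TOOL_REGISTRY[name](**kwargs), ensure_ascii=False)
    except Exception as exc:
        return json.dumps({"error": str(exc)})


print(run_tool("get_weather", '{"location": "San Francisco, CA"}'))
```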

Chat Completions tool calling¶

import json
from openai import OpenAI

client = OpenAI(
  api_key="<SNOWFLAKE_PAT>",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
)

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          }
        },
        "required": ["location"]
      }
    }
  }
]

messages = [
  {"role": "user", "content": "What is the weather like in San Francisco?"}
]

# Step 1: Send the request with tools
response = client.chat.completions.create(
  model="claude-sonnet-4-5",
  messages=messages,
  tools=tools,
)

# Step 2: The model responds with tool_calls
message = response.choices[0].message

if message.tool_calls:
    tool_call = message.tool_calls[0]

    # Step 3: Execute the tool (your implementation)
    result = json.dumps({"temperature": "69°F", "condition": "sunny"})

    # Step 4: Send the tool result back
    messages.append(message)
    messages.append({
      "role": "tool",
      "tool_call_id": tool_call.id,
      "content": result,
    })

    final_response = client.chat.completions.create(
      model="claude-sonnet-4-5",
      messages=messages,
      tools=tools,
    )

    print(final_response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
});

const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get the current weather for a location",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "The city and state, e.g. San Francisco, CA"
          }
        },
        required: ["location"]
      }
    }
  }
];

const messages = [
  { role: "user", content: "What is the weather like in San Francisco?" }
];

// Step 1: Send the request with tools
const response = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  messages,
  tools,
});

// Step 2: The model responds with tool_calls
const message = response.choices[0].message;

if (message.tool_calls) {
  const toolCall = message.tool_calls[0];

  // Step 3: Execute the tool (your implementation)
  const result = JSON.stringify({ temperature: "69°F", condition: "sunny" });

  // Step 4: Send the tool result back
  messages.push(message);
  messages.push({
    role: "tool",
    tool_call_id: toolCall.id,
    content: result,
  });

  const finalResponse = await client.chat.completions.create({
    model: "claude-sonnet-4-5",
    messages,
    tools,
  });

  console.log(finalResponse.choices[0].message.content);
}

Step 1 - Send a request with tools

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "user", "content": "What is the weather like in San Francisco?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              }
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'

The model responds with a ``tool_calls`` array:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

Step 2 - Execute the tool and send the result back

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "user", "content": "What is the weather like in San Francisco?"},
      {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      },
      {
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": "{\"temperature\": \"69°F\", \"condition\": \"sunny\"}"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              }
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
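
The second request repeats the conversation and appends the assistant turn plus one ``tool`` message per call. A minimal sketch of assembling that list from plain dicts follows; the helper name ``append_tool_results`` is hypothetical, and it assumes you have already produced a result string for each tool call id.

```python
def append_tool_results(messages, assistant_message, results):
    """Return a new message list with the assistant turn and tool outputs.

    `results` maps each tool call id to that tool's output string.
    """
    updated = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls", []):
        updated.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": results[call["id"]],
        })
    return updated


history = [{"role": "user", "content": "What is the weather like in San Francisco?"}]
assistant = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"San Francisco, CA\"}",
        },
    }],
}
results = {"call_abc123": "{\"temperature\": \"69°F\", \"condition\": \"sunny\"}"}
for m in append_tool_results(history, assistant, results):
    print(m["role"])
```

Because the helper loops over every entry in ``tool_calls``, it also covers responses in which the model requests several tools at once.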

Messages API tool calling¶

import json
import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={"Authorization": f"Bearer {PAT}"},
)

tools = [
  {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "input_schema": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and state, e.g. San Francisco, CA"
        }
      },
      "required": ["location"]
    }
  }
]

messages = [
  {"role": "user", "content": "What is the weather like in San Francisco?"}
]

# Step 1: Send the request with tools
response = client.messages.create(
  model="claude-sonnet-4-5",
  max_tokens=1024,
  messages=messages,
  tools=tools,
)

# Step 2: The model responds with a tool_use block
if response.stop_reason == "tool_use":
    tool_use = next(b for b in response.content if b.type == "tool_use")

    # Step 3: Execute the tool (your implementation)
    result = json.dumps({"temperature": "69°F", "condition": "sunny"})

    # Step 4: Send the tool result back
    messages.append({"role": "assistant", "content": response.content})
    messages.append({
      "role": "user",
      "content": [
        {
          "type": "tool_result",
          "tool_use_id": tool_use.id,
          "content": result,
        }
      ],
    })

    final_response = client.messages.create(
      model="claude-sonnet-4-5",
      max_tokens=1024,
      messages=messages,
      tools=tools,
    )

    print(final_response.content[0].text)

import Anthropic from "@anthropic-ai/sdk";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

const tools = [
  {
    name: "get_weather",
    description: "Get the current weather for a location",
    input_schema: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "The city and state, e.g. San Francisco, CA"
        }
      },
      required: ["location"]
    }
  }
];

const messages = [
  { role: "user", content: "What is the weather like in San Francisco?" }
];

// Step 1: Send the request with tools
const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages,
  tools,
});

// Step 2: The model responds with a tool_use block
if (response.stop_reason === "tool_use") {
  const toolUse = response.content.find(b => b.type === "tool_use");

  // Step 3: Execute the tool (your implementation)
  const result = JSON.stringify({ temperature: "69°F", condition: "sunny" });

  // Step 4: Send the tool result back
  messages.push({ role: "assistant", content: response.content });
  messages.push({
    role: "user",
    content: [
      {
        type: "tool_result",
        tool_use_id: toolUse.id,
        content: result,
      }
    ],
  });

  const finalResponse = await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 1024,
    messages,
    tools,
  });

  console.log(finalResponse.content[0].text);
}

Step 1 - Send a request with tools

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather like in San Francisco?"}
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    ]
  }'

The model responds with a ``tool_use`` content block:

{
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I'll check the weather for you."
    },
    {
      "type": "tool_use",
      "id": "toolu_abc123",
      "name": "get_weather",
      "input": {"location": "San Francisco, CA"}
    }
  ],
  "stop_reason": "tool_use"
}

Step 2 - Execute the tool and send the result back

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the weather like in San Francisco?"},
      {
        "role": "assistant",
        "content": [
          {"type": "text", "text": "I'\''ll check the weather for you."},
          {
            "type": "tool_use",
            "id": "toolu_abc123",
            "name": "get_weather",
            "input": {"location": "San Francisco, CA"}
          }
        ]
      },
      {
        "role": "user",
        "content": [
          {
            "type": "tool_result",
            "tool_use_id": "toolu_abc123",
            "content": "{\"temperature\": \"69°F\", \"condition\": \"sunny\"}"
          }
        ]
      }
    ],
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    ]
  }'
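
In the Messages API format, tool results travel in a ``user`` message as ``tool_result`` blocks that reference the ``tool_use`` ids. The sketch below builds the two turns to append from plain dicts; the helper name ``tool_result_turns`` is hypothetical, and it assumes a result string is available for each ``tool_use`` id.

```python
def tool_result_turns(assistant_content, results):
    """Return the assistant and user messages that follow a tool_use response.

    `results` maps each tool_use id to that tool's output string.
    """
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": results[block["id"]],
        }
        for block in assistant_content
        if block.get("type") == "tool_use"  # text blocks need no result
    ]
    return [
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": tool_results},
    ]


content = [
    {"type": "text", "text": "I'll check the weather for you."},
    {
        "type": "tool_use",
        "id": "toolu_abc123",
        "name": "get_weather",
        "input": {"location": "San Francisco, CA"},
    },
]
turns = tool_result_turns(content, {"toolu_abc123": "{\"condition\": \"sunny\"}"})
print(turns[1])
```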

Structured output¶

You can request structured JSON output that conforms to a specific schema. Both the Chat Completions API and the Messages API support structured output.
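
The two APIs wrap the same JSON schema in different request fields: ``response_format`` for Chat Completions and ``output_config`` for the Messages API. As a rough sketch, the hypothetical helpers below build both wrappers from one schema, mirroring the field layout used in the examples on this page.

```python
PEOPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "people": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "number"},
                },
                "required": ["name", "age"],
            },
        }
    },
    "required": ["people"],
}


def chat_completions_format(name, schema):
    """Value for the Chat Completions `response_format` field."""
    return {"type": "json_schema", "json_schema": {"name": name, "schema": schema}}


def messages_output_config(schema):
    """Value for the Messages API `output_config` field."""
    return {"format": {"type": "json_schema", "schema": schema}}


print(chat_completions_format("people_data", PEOPLE_SCHEMA)["json_schema"]["name"])
print(messages_output_config(PEOPLE_SCHEMA)["format"]["type"])
```

Keeping the schema in one place avoids the two request shapes drifting apart when you support both endpoints.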

Chat Completions structured output¶

To constrain the model's output, use the ``response_format`` field with a JSON schema.

import json
from openai import OpenAI

client = OpenAI(
  api_key="<SNOWFLAKE_PAT>",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
)

response = client.chat.completions.create(
  model="claude-sonnet-4-5",
  messages=[
    {"role": "user", "content": "Create a dataset of 3 people with their names and ages."}
  ],
  response_format={
    "type": "json_schema",
    "json_schema": {
      "name": "people_data",
      "schema": {
        "type": "object",
        "properties": {
          "people": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "age": {"type": "number"}
              },
              "required": ["name", "age"]
            }
          }
        },
        "required": ["people"]
      }
    }
  }
)

data = json.loads(response.choices[0].message.content)
print(data)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  messages: [
    { role: "user", content: "Create a dataset of 3 people with their names and ages." }
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "people_data",
      schema: {
        type: "object",
        properties: {
          people: {
            type: "array",
            items: {
              type: "object",
              properties: {
                name: { type: "string" },
                age: { type: "number" }
              },
              required: ["name", "age"]
            }
          }
        },
        required: ["people"]
      }
    }
  }
});

const data = JSON.parse(response.choices[0].message.content);
console.log(data);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {"role": "user", "content": "Create a dataset of 3 people with their names and ages."}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "people_data",
        "schema": {
          "type": "object",
          "properties": {
            "people": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name": {"type": "string"},
                  "age": {"type": "number"}
                },
                "required": ["name", "age"]
              }
            }
          },
          "required": ["people"]
        }
      }
    }
  }'

Note

Claude models support only ``json_schema`` as the response format type. OpenAI models support the additional response format types described in the `OpenAI API reference <https://platform.openai.com/docs/api-reference/chat/create>`_.
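
Even with a schema in the request, it can be worth checking the parsed result before handing it to downstream code. The following sketch does that for the ``people`` schema used in these examples with plain Python checks; ``parse_people`` is a hypothetical helper, and a schema validation library could replace the manual checks.

```python
import json


def parse_people(raw: str):
    """Parse model output and verify it matches the people schema."""
    data = json.loads(raw)
    people = data.get("people") if isinstance(data, dict) else None
    if not isinstance(people, list):
        raise ValueError("expected an object with a 'people' array")
    for person in people:
        if not isinstance(person, dict):
            raise ValueError("each entry must be an object")
        if not isinstance(person.get("name"), str):
            raise ValueError("each person needs a string 'name'")
        age = person.get("age")
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(age, bool) or not isinstance(age, (int, float)):
            raise ValueError("each person needs a numeric 'age'")
    return people


print(parse_people('{"people": [{"name": "Ada", "age": 36}]}'))
```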

Messages API structured output¶

Use the output_config parameter with a JSON schema to constrain the model's output. The response contains valid JSON in a text content block that matches your schema.

import json
import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={"Authorization": f"Bearer {PAT}"},
)

response = client.messages.create(
  model="claude-sonnet-4-5",
  max_tokens=1024,
  messages=[
    {"role": "user", "content": "Create a dataset of 3 people with their names and ages."}
  ],
  output_config={
    "format": {
      "type": "json_schema",
      "schema": {
        "type": "object",
        "properties": {
          "people": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "age": {"type": "number"}
              },
              "required": ["name", "age"]
            }
          }
        },
        "required": ["people"],
        "additionalProperties": False
      }
    }
  },
)

data = json.loads(response.content[0].text)
print(data)

import Anthropic from "@anthropic-ai/sdk";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Create a dataset of 3 people with their names and ages." }
  ],
  output_config: {
    format: {
      type: "json_schema",
      schema: {
        type: "object",
        properties: {
          people: {
            type: "array",
            items: {
              type: "object",
              properties: {
                name: { type: "string" },
                age: { type: "number" }
              },
              required: ["name", "age"]
            }
          }
        },
        required: ["people"],
        additionalProperties: false
      }
    }
  },
});

const data = JSON.parse(response.content[0].text);
console.log(data);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Create a dataset of 3 people with their names and ages."}
    ],
    "output_config": {
      "format": {
        "type": "json_schema",
        "schema": {
          "type": "object",
          "properties": {
            "people": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name": {"type": "string"},
                  "age": {"type": "number"}
                },
                "required": ["name", "age"]
              }
            }
          },
          "required": ["people"],
          "additionalProperties": false
        }
      }
    }
  }'

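Image input¶

You can include images in requests to models that support vision. Images must be provided as base64-encoded strings. A conversation can contain up to 20 images, and the maximum request size is 20 MiB.

Image input is supported by the following:

Claude models (``claude-3-7-sonnet`` and newer)
OpenAI models (``openai-gpt-4.1`` and newer)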
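
Checking the limits on the client side lets you fail early instead of waiting for the request to be rejected. The sketch below encodes image bytes as a data URL content part and applies the two limits stated above. The helper names are hypothetical, and the size check only counts the encoded image data, so it is a lower bound on the real request size.

```python
import base64

MAX_IMAGES = 20
MAX_REQUEST_BYTES = 20 * 1024 * 1024  # 20 MiB


def image_part(image_bytes: bytes, media_type: str = "image/png") -> dict:
    """Build an image_url content part from raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{encoded}"},
    }


def check_image_limits(parts) -> None:
    """Raise ValueError if the image parts exceed the documented limits."""
    images = [p for p in parts if p.get("type") == "image_url"]
    if len(images) > MAX_IMAGES:
        raise ValueError(f"{len(images)} images exceeds the limit of {MAX_IMAGES}")
    # Approximation: only the encoded image data is counted here
    approx_size = sum(len(p["image_url"]["url"]) for p in images)
    if approx_size > MAX_REQUEST_BYTES:
        raise ValueError("encoded images exceed the 20 MiB request limit")


part = image_part(b"abc")
check_image_limits([part])
print(part["image_url"]["url"])
```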

Chat Completions image input¶

import base64
from openai import OpenAI

client = OpenAI(
  api_key="<SNOWFLAKE_PAT>",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
)

# Read and encode an image file
with open("image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
  model="claude-sonnet-4-5",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/png;base64,{image_data}"
          }
        },
        {
          "type": "text",
          "text": "What is in this image?"
        }
      ]
    }
  ]
)

print(response.choices[0].message.content)

import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
});

// Read and encode an image file
const imageData = fs.readFileSync("image.png").toString("base64");

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image_url",
          image_url: {
            url: `data:image/png;base64,${imageData}`
          }
        },
        {
          type: "text",
          text: "What is in this image?"
        }
      ]
    }
  ],
});

console.log(response.choices[0].message.content);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,<BASE64_IMAGE_DATA>"
            }
          },
          {
            "type": "text",
            "text": "What is in this image?"
          }
        ]
      }
    ]
  }'

Messages API image input¶

The Messages API uses a different image format. Instead of a data URL, use a ``source`` block with ``type``, ``media_type``, and ``data`` fields.

import base64
import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={"Authorization": f"Bearer {PAT}"},
)

# Read and encode an image file
with open("image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
  model="claude-sonnet-4-5",
  max_tokens=1024,
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/png",
            "data": image_data
          }
        },
        {
          "type": "text",
          "text": "What is in this image?"
        }
      ]
    }
  ],
)

print(response.content[0].text)

import Anthropic from "@anthropic-ai/sdk";
import fs from "fs";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

// Read and encode an image file
const imageData = fs.readFileSync("image.png").toString("base64");

const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/png",
            data: imageData
          }
        },
        {
          type: "text",
          text: "What is in this image?"
        }
      ]
    }
  ],
});

console.log(response.content[0].text);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "source": {
              "type": "base64",
              "media_type": "image/png",
              "data": "<BASE64_IMAGE_DATA>"
            }
          },
          {
            "type": "text",
            "text": "What is in this image?"
          }
        ]
      }
    ]
  }'
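
If an application already builds data URLs for the Chat Completions API, the same image can be reshaped for the Messages API. The following sketch shows one way to do that conversion; the function name ``data_url_to_source_block`` is hypothetical, and it handles only base64 data URLs of the form used in the examples above.

```python
def data_url_to_source_block(data_url: str) -> dict:
    """Convert 'data:<media_type>;base64,<data>' into a Messages API image block."""
    header, separator, data = data_url.partition(",")
    if not separator or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    # Strip the 'data:' prefix and the ';base64' suffix to get the media type.
    media_type = header[len("data:"):-len(";base64")]
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```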

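Prompt caching¶

Prompt caching lets you reuse previously processed context (such as large system prompts, documents, or conversation history) across requests, which reduces latency and cost.

OpenAI models: Caching is **implicit**. Prompts of 1,024 or more tokens are cached automatically, and no changes to the request are required.
Claude models: Caching is **explicit**. Add a ``cache_control`` breakpoint to each content block you want to cache. Only the ``ephemeral`` cache type, with a **5-minute TTL**, is supported. You can set up to four cache breakpoints per request.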
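
The four-breakpoint limit for Claude models can be checked before sending a request. This is a minimal sketch that counts ``cache_control`` entries in Chat Completions style messages; the helper name ``count_cache_breakpoints`` is hypothetical, and for the Messages API you would also need to count blocks in the top-level ``system`` field.

```python
MAX_CACHE_BREAKPOINTS = 4  # per-request limit stated above


def count_cache_breakpoints(messages: list[dict]) -> int:
    """Count content blocks that carry a cache_control breakpoint."""
    count = 0
    for message in messages:
        content = message.get("content")
        # Plain string content cannot carry cache_control; only block lists can.
        if isinstance(content, list):
            count += sum(
                1 for block in content
                if isinstance(block, dict) and "cache_control" in block
            )
    return count
```

A caller could compare the result with ``MAX_CACHE_BREAKPOINTS`` and drop or merge breakpoints before issuing the request.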

Chat Completions prompt caching¶

For Claude models accessed through Chat Completions, add ``cache_control`` to content blocks. OpenAI models are cached automatically and do not need this field.

from openai import OpenAI

client = OpenAI(
  api_key="<SNOWFLAKE_PAT>",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
)

response = client.chat.completions.create(
  model="claude-sonnet-4-5",
  messages=[
    {
      "role": "system",
      "content": [
        {
          "type": "text",
          "text": "<long system prompt to cache>",
          "cache_control": {"type": "ephemeral"}
        }
      ]
    },
    {"role": "user", "content": "Summarize the key points."}
  ]
)

print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  messages: [
    {
      role: "system",
      content: [
        {
          type: "text",
          text: "<long system prompt to cache>",
          cache_control: { type: "ephemeral" }
        }
      ]
    },
    { role: "user", content: "Summarize the key points." }
  ],
});

console.log(response.choices[0].message.content);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      {
        "role": "system",
        "content": [
          {
            "type": "text",
            "text": "<long system prompt to cache>",
            "cache_control": {"type": "ephemeral"}
          }
        ]
      },
      {"role": "user", "content": "Summarize the key points."}
    ]
  }'

Messages API prompt caching¶

Use ``cache_control`` on system or user content blocks. Only the ``ephemeral`` cache type, with a 5-minute TTL, is supported. You can set up to four cache breakpoints per request.

import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={"Authorization": f"Bearer {PAT}"},
)

response = client.messages.create(
  model="claude-sonnet-4-5",
  max_tokens=1024,
  system=[
    {
      "type": "text",
      "text": "<long system prompt to cache>",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  messages=[
    {"role": "user", "content": "Summarize the key points."}
  ],
)

print(response.content[0].text)

import Anthropic from "@anthropic-ai/sdk";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "<long system prompt to cache>",
      cache_control: { type: "ephemeral" }
    }
  ],
  messages: [
    { role: "user", content: "Summarize the key points." }
  ],
});

console.log(response.content[0].text);

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "system": [
      {
        "type": "text",
        "text": "<long system prompt to cache>",
        "cache_control": {"type": "ephemeral"}
      }
    ],
    "messages": [
      {"role": "user", "content": "Summarize the key points."}
    ]
  }'

Note

Anthropic prompt caching uses a 5-minute TTL. Cached content that is not accessed within the TTL window is evicted. OpenAI prompt caching is implicit and managed automatically, so no ``cache_control`` fields are needed.

Thinking and reasoning¶

Chat Completions reasoning¶

For Claude models, use the ``reasoning`` object. For OpenAI reasoning models, use the ``reasoning_effort`` field (values: ``none``, ``minimal``, ``low``, ``medium``, ``high``).

from openai import OpenAI

client = OpenAI(
  api_key="<SNOWFLAKE_PAT>",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
)

# Claude models — use the reasoning object
response = client.chat.completions.create(
  model="claude-sonnet-4-5",
  temperature=1,
  messages=[
    {"role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"}
  ],
  extra_body={
    "reasoning": {"effort": "high"}
  }
)

print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "<SNOWFLAKE_PAT>",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1"
});

// Claude models — use the reasoning object
const response = await client.chat.completions.create({
  model: "claude-sonnet-4-5",
  temperature: 1,
  messages: [
    { role: "user", content: "Are there an infinite number of prime numbers such that n mod 4 == 3?" }
  ],
  reasoning: { effort: "high" },
});

console.log(response.choices[0].message.content);

# Claude models — use the reasoning object
curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "claude-sonnet-4-5",
    "temperature": 1,
    "messages": [
      {"role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"}
    ],
    "reasoning": {
      "effort": "high"
    }
  }'

# OpenAI reasoning models — use reasoning_effort
curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -d '{
    "model": "openai-gpt-5",
    "messages": [
      {"role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"}
    ],
    "reasoning_effort": "high"
  }'
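
When one code path serves both model families, the choice between the two fields can be isolated in a small helper. The sketch below is an illustration only: the function name ``reasoning_params`` is hypothetical, and it assumes the family can be inferred from the ``claude-`` and ``openai-`` model name prefixes used in this document.

```python
def reasoning_params(model: str, effort: str) -> dict:
    """Return the extra request fields that enable reasoning for `model`."""
    if model.startswith("claude-"):
        # Claude models take the reasoning object.
        return {"reasoning": {"effort": effort}}
    if model.startswith("openai-"):
        # OpenAI reasoning models take the flat reasoning_effort field.
        return {"reasoning_effort": effort}
    # Other model families: send no reasoning fields.
    return {}
```

With the OpenAI Python SDK, the result can be passed as ``extra_body=reasoning_params(model, "high")`` in ``client.chat.completions.create``, in the same way as the Python example above.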

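Messages API thinking¶

Some Claude models support **adaptive thinking**, which adjusts the amount of reasoning the model applies based on the complexity of the task. The following models support adaptive thinking: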

claude-opus-4-6 and newer
claude-sonnet-4-6

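For the Messages API, enable adaptive thinking by setting the ``thinking`` parameter to ``type: "adaptive"``. The ``output_config.effort`` parameter controls the depth of thinking at a high level and accepts the following values:


Effort level	Behavior
`max`	Always thinks, with no constraints on thinking depth.
`high` (default)	Always thinks. Provides deep reasoning for complex tasks.
`medium`	Moderate thinking. May skip thinking for very simple queries.
`low`	Minimal thinking. Skips thinking for simple tasks where speed matters most.

The following example shows how to make a Messages API call with adaptive thinking enabled.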

import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={"Authorization": f"Bearer {PAT}"},
)

response = client.messages.create(
  model="claude-opus-4-6",
  max_tokens=16384,
  thinking={
    "type": "adaptive"
  },
  messages=[
    {"role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"}
  ],
)

# The response includes thinking blocks followed by text
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:100]}...")
    elif block.type == "text":
        print(f"Answer: {block.text}")

import Anthropic from "@anthropic-ai/sdk";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

const response = await client.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 16384,
  thinking: {
    type: "adaptive"
  },
  messages: [
    { role: "user", content: "Are there an infinite number of prime numbers such that n mod 4 == 3?" }
  ],
});

// The response includes thinking blocks followed by text
for (const block of response.content) {
  if (block.type === "thinking") {
    console.log(`Thinking: ${block.thinking.slice(0, 100)}...`);
  } else if (block.type === "text") {
    console.log(`Answer: ${block.text}`);
  }
}

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-opus-4-6",
    "max_tokens": 16384,
    "thinking": {
      "type": "adaptive"
    },
    "messages": [
      {"role": "user", "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"}
    ]
  }'
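
The examples above rely on the default effort level. As a sketch of how ``output_config.effort`` might be combined with adaptive thinking, the following helper builds a request body as a plain dictionary. The function name ``adaptive_thinking_request`` is hypothetical, and whether an ``anthropic-beta`` header is also required for the effort parameter is not confirmed here.

```python
VALID_EFFORT_LEVELS = ("max", "high", "medium", "low")  # values from the effort table


def adaptive_thinking_request(model: str, prompt: str,
                              effort: str = "high", max_tokens: int = 16384) -> dict:
    """Build a Messages API request body with adaptive thinking and an effort level."""
    if effort not in VALID_EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {VALID_EFFORT_LEVELS}")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "adaptive"},
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }
```

The resulting dictionary can be serialized with ``json.dumps`` and sent as the body of a POST request to the ``/api/v2/cortex/v1/messages`` endpoint.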

The response contains thinking blocks with summarized thinking and a thinking signature. To preserve the reasoning context, send these blocks back in multi-turn conversations.

{
  "role": "assistant",
  "content": [
    {"type": "thinking", "thinking": "<thinking>", "signature": "<signature>"},
    {"type": "text", "text": "Yes, there are infinitely many primes p where p ≡ 3 (mod 4)..."}
  ]
}
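
Sending the blocks back amounts to appending the assistant content, unchanged, to the message history before the next user turn. The following is a minimal sketch using plain dictionaries; the helper name ``append_turn`` is hypothetical.

```python
def append_turn(history: list[dict], assistant_content: list[dict],
                next_user_text: str) -> list[dict]:
    """Return a new message list that keeps the assistant's thinking blocks intact."""
    return history + [
        # Pass the content blocks back exactly as received, including the
        # thinking text and its signature; do not edit or reorder them.
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": next_user_text},
    ]
```

The returned list can then be used as the ``messages`` value of the next request.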

For details on Messages API support for adaptive thinking, see `Claude API documentation: Adaptive thinking<https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking>`__.

Beta features (Messages API)¶

The Messages API supports Anthropic beta features through the ``anthropic-beta`` header. Pass one or more beta header values as a comma-separated string.

Supported beta headers¶
Beta header value	Feature
`token-efficient-tools-2025-02-19`	Token-efficient tools
`interleaved-thinking-2025-05-14`	Interleaved thinking
`output-128k-2025-02-19`	Enables up to 128K output tokens
`dev-full-thinking-2025-05-14`	Developer mode for raw thinking on Claude 4+ models
`context-management-2025-06-27`	Context management
`effort-2025-11-24`	Effort parameter for thinking
`tool-search-tool-2025-10-19`	Tool search tool
`tool-examples-2025-10-29`	Tool use examples

The following example demonstrates using tool examples with claude-sonnet-4-6:

import httpx
import anthropic

PAT = "<SNOWFLAKE_PAT>"

http_client = httpx.Client(
  headers={"Authorization": f"Bearer {PAT}"},
)

client = anthropic.Anthropic(
  api_key="not-used",
  base_url="https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  http_client=http_client,
  default_headers={
    "Authorization": f"Bearer {PAT}",
  },
)

response = client.beta.messages.create(
  model="claude-sonnet-4-6",
  max_tokens=8192,
  betas=["tool-examples-2025-10-29"],
  tools=[
    {
      "name": "get_weather",
      "description": "Get the current weather for a location",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          }
        },
        "required": ["location"]
      },
      "examples": [
        {
          "input": {"location": "San Francisco, CA"},
          "output": {"temperature": "65°F", "condition": "sunny"}
        }
      ]
    }
  ],
  messages=[
    {"role": "user", "content": "What's the weather in New York?"}
  ],
)

print(f"Stop reason: {response.stop_reason}")
if response.stop_reason == "tool_use":
  tool_use = next(b for b in response.content if b.type == "tool_use")
  print(f"Tool called: {tool_use.name}")
  print(f"Arguments: {tool_use.input}")

import Anthropic from "@anthropic-ai/sdk";

const PAT = "<SNOWFLAKE_PAT>";

const client = new Anthropic({
  apiKey: "not-used",
  baseURL: "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex",
  defaultHeaders: {
    "Authorization": `Bearer ${PAT}`,
  },
});

const response = await client.beta.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 8192,
  betas: ["tool-examples-2025-10-29"],
  tools: [
    {
      name: "get_weather",
      description: "Get the current weather for a location",
      input_schema: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "The city and state, e.g. San Francisco, CA"
          }
        },
        required: ["location"]
      },
      examples: [
        {
          input: { location: "San Francisco, CA" },
          output: { temperature: "65°F", condition: "sunny" }
        }
      ]
    }
  ],
  messages: [
    { role: "user", content: "What's the weather in New York?" }
  ],
});

console.log(`Stop reason: ${response.stop_reason}`);
if (response.stop_reason === "tool_use") {
  const toolUse = response.content.find(b => b.type === "tool_use");
  console.log(`Tool called: ${toolUse.name}`);
  console.log(`Arguments: ${JSON.stringify(toolUse.input)}`);
}

curl "https://<account-identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <SNOWFLAKE_PAT>" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: tool-examples-2025-10-29" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 8192,
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        },
        "examples": [
          {
            "input": {"location": "San Francisco, CA"},
            "output": {"temperature": "65°F", "condition": "sunny"}
          }
        ]
      }
    ],
    "messages": [
      {"role": "user", "content": "What'\''s the weather in New York?"}
    ]
  }'

You can combine multiple beta features by passing a comma-separated string.

-H "anthropic-beta: tool-examples-2025-10-29,tool-search-tool-2025-10-19"
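
When requests are built in code rather than with curl, the header value can be assembled from a list. The sketch below validates names against the supported values listed in this section; the helper name ``beta_header`` is hypothetical.

```python
SUPPORTED_BETAS = {
    "token-efficient-tools-2025-02-19",
    "interleaved-thinking-2025-05-14",
    "output-128k-2025-02-19",
    "dev-full-thinking-2025-05-14",
    "context-management-2025-06-27",
    "effort-2025-11-24",
    "tool-search-tool-2025-10-19",
    "tool-examples-2025-10-29",
}


def beta_header(*features: str) -> dict:
    """Return an anthropic-beta header dict for one or more supported features."""
    unknown = [name for name in features if name not in SUPPORTED_BETAS]
    if not features or unknown:
        raise ValueError(f"unsupported or missing beta features: {unknown}")
    # Multiple values are joined into one comma-separated string.
    return {"anthropic-beta": ",".join(features)}
```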

Chat Completions API reference¶

POST /api/v2/cortex/v1/chat/completions¶

Generates a chat completion using the specified model. The request and response formats follow the `OpenAI Chat Completions API specification<https://platform.openai.com/docs/api-reference/chat/create>`_.

POST https://<account_identifier>.snowflakecomputing.com/api/v2/cortex/v1/chat/completions

Required headers¶

Authorization: Bearer token: Authorization for the request. ``token`` is a JSON Web Token (JWT), an OAuth token, or a :doc:`programmatic access token </user-guide/programmatic-access-tokens>`. For details, see :doc:`/developer-guide/snowflake-rest-api/authentication`.
Content-Type: application/json: Specifies that the request body is in JSON format.

Optional headers¶

X-Snowflake-Authorization-Token-Type: type

Defines the type of the authorization token.

If you omit the X-Snowflake-Authorization-Token-Type header, Snowflake determines the token type by examining the token.

This header is optional. If you choose to specify it, set it to one of the following values:

KEYPAIR_JWT (for key-pair authentication)
OAUTH (for OAuth)
PROGRAMMATIC_ACCESS_TOKEN (for programmatic access tokens)

Accept: application/json, text/event-stream

Specifies that the response contains either JSON (in the error case) or server-sent events.

Required JSON fields¶


Field	Type	Description
`model`	string	The model to use (see :ref:`label-cortex_complete_llm_model_availability`). You can also use the fully qualified name of any :doc:`fine-tuned </user-guide/snowflake-cortex/cortex-finetuning>` model, in the form :samp:`{database}.{schema}.{model}`.
`messages`	array	An array of message objects representing the conversation. Each message must have a ``role`` (`system`, `user`, `assistant`, or `tool`) and ``content`` (a string or an array of content parts).

Commonly used optional JSON fields¶


Field	Type	Default	Description
`max_completion_tokens`	integer	4096	Maximum number of tokens in the response. The theoretical maximum is 131,072; each model has its own output limit.
`temperature`	number	Varies by model	Controls randomness. Values range from 0 to 2.
`top_p`	number	1.0	Controls diversity through nucleus sampling.
`stream`	boolean	false	Whether to stream back partial progress as server-sent events.
`tools`	array	null	A list of tools the model can call. Each tool must have ``type: "function"`` and a ``function`` object with ``name``, ``description``, and ``parameters``.
`tool_choice`	string or object	`"auto"`	Controls how the model selects tools. Options: `"auto"`, `"required"`, `"none"`, or an object that specifies a particular function.
`response_format`	object	null	Constrains the output format. For structured output, use ``{"type": "json_schema", "json_schema": {...}}``.
`reasoning_effort`	string	null	For OpenAI reasoning models. Values: `"none"`, `"minimal"`, `"low"`, `"medium"`, `"high"`.
`reasoning`	object	null	For Claude models. Set ``reasoning.effort`` or ``reasoning.max_tokens`` to enable thinking.

For the complete list of supported fields for each model family, see the :ref:`detailed compatibility chart <label-cortex_openai_sdk_compatibility>`.

Status codes¶

200 OK: The request completed successfully.
400 invalid options object: The optional arguments have invalid values.
400 unknown model model_name: The specified model does not exist.
400 schema validation failed: The response schema structure is incorrect.
400 max tokens of count exceeded: The request exceeded the maximum number of tokens supported by the model.
400 all requests were throttled by remote service: The request was throttled. Try again later.
402 budget exceeded: The model consumption budget was exceeded.
403 Not Authorized: The account is not enabled for the REST API, or the calling user's default role does not have the snowflake.cortex_user database role.
429 too many requests: The usage quota was exceeded. Try again later.
503 inference timed out: The request took too long.

Limitations¶

If not set, ``max_completion_tokens`` defaults to 4096. Each model has its own output token limit.
Tool calling is supported only for OpenAI and Claude models.
Audio is not supported.
Image understanding is supported only for OpenAI and Claude models. A conversation can include at most 20 images, and the maximum request size is 20 MiB.
Ephemeral cache control points for prompt caching are supported only for Claude models. OpenAI models support implicit caching.
Only Claude models support returning reasoning details in the response.
The ``temperature`` field is ignored for Claude Opus 4.7 because the model no longer supports temperature.
``max_tokens`` is deprecated. Use ``max_completion_tokens`` instead.
Error messages are generated by Snowflake, not by the model provider.

Detailed compatibility chart¶

The following tables summarize which request and response fields are supported when you use the Chat Completions API with the different Snowflake-hosted model families.

Request fields¶
Field	OpenAI models	Claude models	Other models
`model`	✔ Supported	✔ Supported	✔ Supported
`messages`	See subfields	See subfields	See subfields
`messages[].audio`	❌ Error	❌ Ignored	❌ Ignored
`messages[].role`	✔ Supported	✔ user/assistant/system only	✔ user/assistant/system only
`messages[].content` (string)	✔ Supported	✔ Supported	✔ Supported
`messages[].content[]` (array)	See subfields	See subfields	See subfields
`messages[].content[].text`	✔ Supported	✔ Supported	✔ Supported
`messages[].content[].type`	✔ Supported	✔ Supported	✔ Supported
`messages[].content[].image_url`	✔ Supported	✔ Supported	❌ Error
`messages[].content[].cache_control`	❌ Ignored	✔ Supported (ephemeral only)	❌ Ignored
`messages[].content[].file`	❌ Error	❌ Error	❌ Ignored
`messages[].content[].input_audio`	❌ Error	❌ Ignored	❌ Ignored
`messages[].content[].refusal`	✔ Supported	❌ Ignored	❌ Ignored
`messages[].function_call`	✔ Supported (deprecated)	❌ Ignored	❌ Ignored
`messages[].name`	✔ Supported	❌ Ignored	❌ Ignored
`messages[].refusal`	✔ Supported	❌ Ignored	❌ Ignored
`messages[].tool_call_id`	✔ Supported	✔ Supported	❌ Ignored
`messages[].tool_calls`	✔ Supported	✔ `function` tools only	❌ Ignored
`messages[].reasoning_details`	❌ Ignored	✔ OpenRouter format `reasoning.text`	❌ Ignored
`audio`	❌ Error	❌ Ignored	❌ Ignored
`frequency_penalty`	✔ Supported	❌ Ignored	❌ Ignored
`logit_bias`	✔ Supported	❌ Ignored	❌ Ignored
`logprobs`	✔ Supported	❌ Ignored	❌ Ignored
`max_tokens`	❌ Error (deprecated)	❌ Error (deprecated)	❌ Error (deprecated)
`max_completion_tokens`	✔ Supported (default 4096, max 131072)	✔ Supported (default 4096, max 131072)	✔ Supported (default 4096, max 131072)
`metadata`	❌ Ignored	❌ Ignored	❌ Ignored
`modalities`	❌ Ignored	❌ Ignored	❌ Ignored
`n`	✔ Supported	❌ Ignored	❌ Ignored
`parallel_tool_calls`	✔ Supported	❌ Ignored	❌ Ignored
`prediction`	✔ Supported	❌ Ignored	❌ Ignored
`presence_penalty`	✔ Supported	❌ Ignored	❌ Ignored
`prompt_cache_key`	✔ Supported	❌ Ignored	❌ Ignored
`reasoning_effort`	✔ Supported	❌ Ignored (use the `reasoning` object)	❌ Ignored
`reasoning`	See subfields	See subfields	See subfields
`reasoning.effort`	✔ Supported (overrides `reasoning_effort`)	✔ Converted to `reasoning.max_tokens`	❌ Ignored
`reasoning.max_tokens`	❌ Ignored	✔ Supported	❌ Ignored
`response_format`	✔ Supported	✔ `json_schema` only	❌ Ignored
`safety_identifier`	❌ Ignored	❌ Ignored	❌ Ignored
`service_tier`	❌ Error	❌ Error	❌ Error
`stop`	✔ Supported	❌ Ignored	❌ Ignored
`store`	❌ Error	❌ Error	❌ Error
`stream`	✔ Supported	✔ Supported	✔ Supported
`stream_options`	See subfields	See subfields	See subfields
`stream_options.include_obfuscation`	❌ Ignored	❌ Ignored	❌ Ignored
`stream_options.include_usage`	✔ Supported	✔ Supported	✔ Supported
`temperature`	✔ Supported	✔ Supported	✔ Supported
`tool_choice`	✔ Supported	✔ `function` tools only	❌ Ignored
`tools`	✔ Supported	✔ `function` tools only	❌ Error
`top_logprobs`	✔ Supported	❌ Ignored	❌ Ignored
`top_p`	✔ Supported	✔ Supported	✔ Supported
`verbosity`	✔ Supported	❌ Ignored	❌ Ignored
`web_search_options`	❌ Error	❌ Ignored	❌ Ignored

Response fields¶
Field	OpenAI models	Claude models	Other models
`id`	✔ Supported	✔ Supported	✔ Supported
`object`	✔ Supported	✔ Supported	✔ Supported
`created`	✔ Supported	✔ Supported	✔ Supported
`model`	✔ Supported	✔ Supported	✔ Supported
`choices`	See subfields	See subfields	See subfields
`choices[].index`	✔ Supported	✔ Single choice only	✔ Single choice only
`choices[].finish_reason`	✔ Supported	❌ Not supported	✔ `stop` only
`choices[].logprobs`	✔ Supported	❌ Not supported	❌ Not supported
`choices[].message` (non-streaming)	See subfields	See subfields	See subfields
`choices[].message.content`	✔ Supported	✔ Supported	✔ Supported
`choices[].message.role`	✔ Supported	✔ Supported	✔ Supported
`choices[].message.refusal`	✔ Supported	❌ Not supported	❌ Not supported
`choices[].message.annotations`	❌ Not supported	❌ Not supported	❌ Not supported
`choices[].message.audio`	❌ Not supported	❌ Not supported	❌ Not supported
`choices[].message.function_call`	✔ Supported	❌ Not supported	❌ Not supported
`choices[].message.tool_calls`	✔ Supported	✔ `function` tools only	❌ Not supported
`choices[].message.reasoning`	❌ Not supported	✔ OpenRouter format	❌ Not supported
`choices[].delta` (streaming)	See subfields	See subfields	See subfields
`choices[].delta.content`	✔ Supported	✔ Supported	✔ Supported
`choices[].delta.role`	✔ Supported	❌ Not supported	❌ Not supported
`choices[].delta.refusal`	✔ Supported	❌ Not supported	❌ Not supported
`choices[].delta.function_call`	✔ Supported	❌ Not supported	❌ Not supported
`choices[].delta.tool_calls`	✔ Supported	✔ `function` tools only	❌ Not supported
`choices[].delta.reasoning`	❌ Not supported	✔ OpenRouter format	❌ Not supported
`usage`	See subfields	See subfields	See subfields
`usage.prompt_tokens`	✔ Supported	✔ Supported	✔ Supported
`usage.completion_tokens`	✔ Supported	✔ Supported	✔ Supported
`usage.total_tokens`	✔ Supported	✔ Supported	✔ Supported
`usage.prompt_tokens_details`	See subfields	See subfields	See subfields
`usage.prompt_tokens_details.audio_tokens`	❌ Not supported	❌ Not supported	❌ Not supported
`usage.prompt_tokens_details.cached_tokens`	✔ Cache reads only	✔ Cache reads and writes	❌ Not supported
`usage.completion_tokens_details`	See subfields	See subfields	See subfields
`usage.completion_tokens_details.accepted_prediction_tokens`	✔ Supported	❌ Not supported	❌ Not supported
`usage.completion_tokens_details.audio_tokens`	❌ Not supported	❌ Not supported	❌ Not supported
`usage.completion_tokens_details.reasoning_tokens`	✔ Supported	❌ Not supported	❌ Not supported
`usage.completion_tokens_details.rejected_prediction_tokens`	✔ Supported	❌ Not supported	❌ Not supported
`service_tier`	✔ Supported	❌ Not supported	❌ Not supported
`system_fingerprint`	✔ Supported	❌ Not supported	❌ Not supported

Request headers¶
Header	Support
`Authorization`	✔ Required
`Content-Type`	✔ Supported (`application/json`)
`Accept`	✔ Supported (`application/json`, `text/event-stream`)

Response headers¶
Header	Support
`openai-organization`	❌ Not supported
`openai-version`	❌ Not supported
`openai-processing-ms`	❌ Not supported
`x-ratelimit-limit-requests`	❌ Not supported
`x-ratelimit-limit-tokens`	❌ Not supported
`x-ratelimit-remaining-requests`	❌ Not supported
`x-ratelimit-remaining-tokens`	❌ Not supported
`x-ratelimit-reset-requests`	❌ Not supported
`x-ratelimit-reset-tokens`	❌ Not supported
`retry-after`	❌ Not supported

Learn more¶

For more usage examples, see the `OpenAI Chat Completions API reference<https://platform.openai.com/docs/guides/completions/>`_ or the `OpenAI Cookbook<https://cookbook.openai.com/>`_.

In addition to Chat Completions API compatibility, Snowflake supports OpenRouter-compatible features for Claude models. These features are exposed as additional fields on the request.

For prompt caching, use the ``cache_control`` field. See the `OpenRouter prompt caching documentation<https://openrouter.ai/docs/features/prompt-caching>`_.
For reasoning tokens, use the ``reasoning`` field. See the `OpenRouter reasoning documentation<https://openrouter.ai/docs/use-cases/reasoning-tokens>`_.

Messages API reference¶

POST /api/v2/cortex/v1/messages¶

Generates a response using a Claude model. The request and response formats follow the `Anthropic Messages API specification<https://docs.anthropic.com/en/api/messages>`_.

POST https://<account_identifier>.snowflakecomputing.com/api/v2/cortex/v1/messages

Note

The Messages API supports **Claude models only**. For other models, use the Chat Completions API.

Required headers¶

Authorization: Bearer token: Authorization for the request. ``token`` is a JSON Web Token (JWT), an OAuth token, or a :doc:`programmatic access token </user-guide/programmatic-access-tokens>`. For details, see :doc:`/developer-guide/snowflake-rest-api/authentication`.
Content-Type: application/json: Specifies that the request body is in JSON format.
anthropic-version: 2023-06-01: The required Anthropic API version header.

Optional headers¶

X-Snowflake-Authorization-Token-Type: type

Defines the type of the authorization token.

If you omit the X-Snowflake-Authorization-Token-Type header, Snowflake determines the token type by examining the token.

This header is optional. If you choose to specify it, set it to one of the following values:

KEYPAIR_JWT (for key-pair authentication)
OAUTH (for OAuth)
PROGRAMMATIC_ACCESS_TOKEN (for programmatic access tokens)

anthropic-beta: feature

Enables beta features. Only Bedrock-compatible beta headers are supported.

Required JSON fields¶


Field	Type	Description
`model`	string	The Claude model to use (see :ref:`label-cortex_complete_llm_model_availability`).
`max_tokens`	integer	The maximum number of tokens to generate.
`messages`	array	An array of message objects. Each message has a ``role`` (``user`` or ``assistant``) and ``content`` (a string or an array of content blocks).

Supported features¶

The Messages API supports the standard Anthropic Messages API feature set for Claude models, including the following:

Text generation and multi-turn conversations
Streaming (``"stream": true``)
System messages (through the top-level ``system`` field)
Tool calling (Anthropic format with ``name``, ``description``, and ``input_schema``)
Structured output (``output_config`` with ``json_schema``)
Image input (base64 source blocks)
Prompt caching (``cache_control`` on content blocks)
Adaptive thinking (``thinking`` parameter with ``type: "adaptive"``) and extended thinking (``budget_tokens``)

For details on the request and response schemas, see the `Anthropic Messages API documentation<https://docs.anthropic.com/en/api/messages>`_.

Limitations¶

**Claude models only.** OpenAI, Llama, Mistral, and other models are not available through this endpoint.
**No flex processing or priority tier.** The ``service_tier`` field is not supported.
**Bedrock beta headers only.** Only Bedrock-compatible ``anthropic-beta`` header values are supported.
Error messages are generated by Snowflake, not by Anthropic.

Status codes¶

200 OK: The request completed successfully.
400 invalid_request_error: The request body is malformed or contains invalid values.
400 unknown model model_name: The specified model does not exist or is not a Claude model.
402 budget exceeded: The model consumption budget was exceeded.
403 Not Authorized: The account is not enabled for the REST API, or the default role does not have the ``snowflake.cortex_user`` database role.
429 too many requests: The usage quota was exceeded. Try again later.
503 inference timed out: The request took too long.

Rate limits¶

To ensure high performance for all Snowflake customers, Cortex REST API requests are subject to rate limits. Requests that exceed the limits might receive an HTTP 429 response. Snowflake might occasionally adjust these limits.

The default limits in the following table apply per account and independently for each model. Make sure your application handles 429 responses gracefully by retrying with `exponential backoff<https://platform.openai.com/docs/guides/rate-limits#retrying-with-exponential-backoff>`_.

If you need these limits increased, contact Snowflake Support.
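
Retrying 429 responses with exponential backoff can be sketched as follows. This is an illustration, not a prescribed client: the function name ``call_with_backoff`` and its parameters are hypothetical, ``send`` stands for any callable that performs one request and returns a ``(status_code, body)`` pair, and production code would typically add random jitter to the delays.

```python
import time
from typing import Any, Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, Any]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, Any]:
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles on each retry: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    # All attempts were rate limited; return the last 429 response.
    return status, body
```

The ``sleep`` parameter is injectable so that the retry schedule can be tested without waiting.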

Cortex REST API rate limits¶
Model	Tokens processed per minute (TPM)	Requests per minute (RPM)	Max output (tokens)
`claude-3-7-sonnet`	600,000	600	16,384
`claude-4-opus`	200,000	200	16,384
`claude-4-sonnet`	2,000,000	1,200	16,384
`claude-haiku-4-5`	5,000,000	10,000	16,384
`claude-opus-4-5`	2,000,000	10,000	16,384
`claude-opus-4-6`	3,000,000	10,000	16,384
`claude-opus-4-7`	3,000,000	10,000	16,384
`claude-sonnet-4-5`	5,000,000	10,000	16,384
`claude-sonnet-4-6`	6,000,000	10,000	16,384
`deepseek-r1`	100,000	100	16,384
`llama3.1-8b`	800,000	800	16,384
`llama3.1-70b`	400,000	400	16,384
`llama3.1-405b`	200,000	200	16,384
`mistral-7b`	400,000	400	16,384
`mistral-large2`	600,000	200	16,384
`openai-gpt-4.1`	2,000,000	600	16,384
`openai-gpt-5`	600,000	600	16,384
`openai-gpt-5-chat`	1,000,000	1,000	16,384
`openai-gpt-5-mini`	2,000,000	2,000	16,384
`openai-gpt-5-nano`	10,000,000	10,000	16,384
`openai-gpt-5.1`	600,000	3,000	16,384
`openai-gpt-5.2`	1,200,000	3,000	16,384

Higher rate limits with cross-region inference¶

If you've opted into Cross Cloud, AWS Global, or Azure Global cross-region inference (see cross-region inference) in your Snowflake account, the rate limits are higher for the following models:

Cortex REST API rate limits with cross-region inference¶
Model	Tokens processed per minute (TPM)	Requests per minute (RPM)	Max output (tokens)
`claude-haiku-4-5`	5,000,000	10,000	16,384
`claude-opus-4-5`	2,000,000	10,000	16,384
`claude-opus-4-6`	3,000,000	10,000	16,384
`claude-opus-4-7`	3,000,000	10,000	16,384
`claude-sonnet-4-5`	5,000,000	10,000	16,384
`claude-sonnet-4-6`	6,000,000	10,000	16,384
`openai-gpt-4.1`	1,000,000	1,000	16,384
`openai-gpt-5`	1,000,000	10,000	16,384
`openai-gpt-5.1`	1,000,000	10,000	16,384
`openai-gpt-5.2`	1,000,000	10,000	16,384

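Troubleshooting rate limit events¶

Exceeding either the TPM or the RPM limit produces a 429 response code. If you receive 429 responses even though your REST API usage is below the requests-per-minute limit, check your token usage.

The Cortex REST API implements rate limiting with a sliding window counter pattern. The counters are stored in a highly available Redis cluster that is accessible only from Snowflake Cortex within Snowflake's private network.

The sliding window counter assumes that API client traffic in the previous time window was uniformly distributed. For spiky traffic, this assumption can overestimate the request rate, but the estimate recovers quickly because the window is short. If you are affected by overestimation and want your limits increased, contact Snowflake Support.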
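
To make the uniform-distribution assumption concrete, the following is a minimal single-process sketch of the general sliding window counter technique. It is not Snowflake's implementation: the class name, the in-memory state (in place of Redis), and the 60-second window are all assumptions for illustration.

```python
class SlidingWindowCounter:
    """Approximate a trailing-window count from two fixed-window counters."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self.current_start = 0.0
        self.current_count = 0
        self.previous_count = 0

    def _roll(self, now: float) -> None:
        elapsed_windows = int((now - self.current_start) // self.window)
        if elapsed_windows >= 1:
            # Only the immediately preceding window contributes to the estimate.
            self.previous_count = self.current_count if elapsed_windows == 1 else 0
            self.current_count = 0
            self.current_start += elapsed_windows * self.window

    def estimated_count(self, now: float) -> float:
        self._roll(now)
        # Fraction of the previous window still inside the trailing window,
        # assuming its traffic was spread uniformly.
        overlap = 1.0 - (now - self.current_start) / self.window
        return self.previous_count * overlap + self.current_count

    def allow(self, now: float, cost: int = 1) -> bool:
        if self.estimated_count(now) + cost > self.limit:
            return False
        self.current_count += cost
        return True
```

With a limit of 10 and a burst of 10 requests at time 0, a request at time 61 is still rejected even though no requests fall in the true trailing 60 seconds, because the burst is treated as if it were spread evenly over the previous window. By time 90 the estimate has dropped to 5 and requests are accepted again.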

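Known issues¶

Session token expiration¶

We recommend authenticating with one of the three methods defined in :doc:`/developer-guide/snowflake-rest-api/authentication`. However, if you choose to authenticate with a Snowflake session token, you must handle token refresh so that API access is not interrupted.

Session tokens expire periodically. When a request is made with an expired session token, the REST API returns a ``200 OK`` response that contains error code ``390112``. When this happens, the operation is not performed.

To handle this behavior, your application must do the following:

Check every API response for error code ``390112``, even when the HTTP status code is ``200 OK``.
If error code ``390112`` is detected, refresh the session token and retry the request.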
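
The two steps above can be sketched as a small wrapper. This is an illustration under stated assumptions: the helper names are hypothetical, ``send`` and ``refresh`` are caller-supplied callables, and the error code is assumed to appear in a top-level ``code`` field of the JSON body, which this document does not specify.

```python
from typing import Callable, Tuple

SESSION_EXPIRED_CODE = "390112"


def is_session_expired(status_code: int, body: dict) -> bool:
    """True when a 200 OK response carries the session-expired error code."""
    # Assumption: the code is in a top-level "code" field; adjust as needed.
    return status_code == 200 and str(body.get("code")) == SESSION_EXPIRED_CODE


def request_with_refresh(send: Callable[[], Tuple[int, dict]],
                         refresh: Callable[[], None]) -> Tuple[int, dict]:
    """Send a request; on session expiry, refresh the token and retry once."""
    status, body = send()
    if is_session_expired(status, body):
        refresh()
        status, body = send()
    return status, body
```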

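Note

This behavior affects only applications that use Snowflake session tokens. If you authenticate with key-pair authentication, OAuth, or :ref:`programmatic access tokens (PATs)<label-sfrest_authenticating_pat>`, you do not need to implement this error handling.

Cost considerations¶

Snowflake Cortex REST API requests incur compute cost based on the number of tokens processed. For the cost of each model in dollars per million tokens, see the `Snowflake Service Consumption Table<https://www.snowflake.com/legal-files/CreditConsumptionTable.pdf>`_.

A token is the smallest unit of text processed by Snowflake Cortex LLM functions, roughly equal to four characters of text. The amount of raw input or output text that corresponds to a token can vary by model.

Both input and output tokens incur compute cost. When you use the API to provide a conversational or chat user experience, all previous prompts and responses are processed to generate each new response, with corresponding costs.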
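
The effect of resending the conversation history can be estimated with simple arithmetic. The sketch below is a rough model only: the function names are hypothetical, the four-characters-per-token rule is the approximation stated above, and prompt caching and per-model tokenizer differences are ignored.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate using the four-characters-per-token rule of thumb."""
    return math.ceil(len(text) / 4)


def cumulative_input_tokens(turns: list[tuple[int, int]]) -> int:
    """Total input tokens processed across a conversation.

    `turns` holds (prompt_tokens, response_tokens) for each turn. Every request
    resends all earlier prompts and responses plus the new prompt.
    """
    total = 0
    history = 0
    for prompt_tokens, response_tokens in turns:
        history += prompt_tokens      # the new prompt joins the history
        total += history              # the whole history is processed as input
        history += response_tokens    # the response is resent on later turns
    return total
```

For three turns of 100 prompt tokens and 200 response tokens each, the requests process 100, 400, and 700 input tokens, for a total of 1,200 input tokens rather than 300.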