Cortex Agents¶
Cortex Agents orchestrate across both structured and unstructured data sources to deliver insights. They plan tasks, use tools to execute those tasks, and generate responses. Agents use Cortex Analyst (structured) and Cortex Search (unstructured) as tools, along with LLMs, to analyze data.
The workflow involves four key components:
Planning: Applications often switch between processing data from structured and unstructured sources. For example, consider a conversational app designed to answer user queries. A business user may first ask for top distributors by revenue (structured) and then switch to inquiring about a contract (unstructured). Cortex Agents can parse a request to orchestrate a plan and arrive at the solution or response.
Explore options: When the user poses an ambiguous question (for example, “Tell me about Acme Supplies”), the agent considers different permutations - products, location, or sales personnel - to disambiguate and improve accuracy.
Split into subtasks: Cortex Agents can split a task or request (for example, “What are the differences between contract terms for Acme Supplies and Acme Stationery?”) into multiple parts for a more precise response.
Route across tools: The agent selects the right tool - Cortex Analyst or Cortex Search - to ensure governed access and compliance with enterprise policies.
Tool use: With a plan in place, the agent retrieves data efficiently. Cortex Search extracts insights from unstructured sources, while Cortex Analyst generates SQL to process structured data. Comprehensive support for tool identification and tool execution enables sophisticated applications grounded in enterprise data.
Reflection: After each tool use, the agent evaluates results to determine the next steps - asking for clarification, iterating, or generating a final response. This orchestration allows it to handle complex data queries while ensuring accuracy and compliance within Snowflake’s secure perimeter.
Monitor and iterate: After deployment, customers can track metrics, analyze performance, and refine behavior for continuous improvement. In the client application, developers can use TruLens to monitor Agent interactions. By continuously monitoring and refining governance controls, enterprises can confidently scale AI agents while maintaining security and compliance.
Note
While Snowflake strives to provide high-quality responses, the accuracy of the LLM responses and of the citations provided is not guaranteed. You should review all answers from the Agents API before serving them to your users.
Access control requirements¶
The querying user must have:
USAGE on the Cortex Search Service referenced in the query.
USAGE on the database, schema, and tables referenced in the Cortex Analyst semantic model.
A role with the CORTEX_USER database role granted. For more information, see Required privileges.
How to use the Agent API¶
This section shows the steps to create an agent using the Agent API.
Configure Key-Pair Authentication¶
The Snowflake REST APIs currently support authentication only via JWT tokens. To query the Agents API REST endpoint with a JWT token, you must first set up key-pair authentication. See Using key pair authentication for instructions. Then generate a JWT token from your key-pair authentication setup. One way to generate a token is to use SnowSQL.
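As a rough illustration of what goes into such a token, the sketch below assembles the claims of a key-pair JWT using only the Python standard library. The claim layout (an issuer of the form ACCOUNT.USER.SHA256:fingerprint, a subject of ACCOUNT.USER, and a lifetime of at most one hour) is an assumption based on the key-pair authentication documentation and should be checked there. The function names are hypothetical, and signing the claims with RS256 is left to a JWT library of your choice.

```python
import base64
import hashlib
import time
from typing import Optional


def public_key_fingerprint(public_key_der: bytes) -> str:
    """Return 'SHA256:<base64 digest>' for a DER-encoded public key."""
    digest = hashlib.sha256(public_key_der).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii")


def build_jwt_claims(account: str, user: str, fingerprint: str,
                     lifetime_seconds: int = 3600,
                     now: Optional[int] = None) -> dict:
    """Assemble JWT claims (assumed layout; verify against the key-pair docs)."""
    issued_at = int(time.time()) if now is None else now
    qualified_user = f"{account.upper()}.{user.upper()}"
    return {
        "iss": f"{qualified_user}.{fingerprint}",  # issuer includes the key fingerprint
        "sub": qualified_user,
        "iat": issued_at,
        "exp": issued_at + lifetime_seconds,
    }
```

The resulting dictionary still has to be signed with your private key before it can be sent as the Bearer token.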
Create a Cortex Analyst Service¶
If you plan to use Cortex Analyst as a tool, create a Cortex Analyst semantic model. For an example of creating a semantic model, see the example in Create a semantic model.
Create a Cortex Search Service¶
If you plan to use Cortex Search as a tool, create a Cortex Search Service. For an example of creating a service, see the example in CREATE CORTEX SEARCH SERVICE.
Note
The DEFAULT_ROLE of the querying user must have USAGE privilege on the Cortex Search Service, as well as the database and schema in which it resides.
Calling the API¶
First, locate your Snowflake account URL. Once you have your URL and your JWT token, you can query the Agents API from the command line using cURL with the following syntax:
curl -X POST "$SNOWFLAKE_ACCOUNT_BASE_URL/api/v2/cortex/agent:run" \
--header 'X-Snowflake-Authorization-Token-Type: KEYPAIR_JWT' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header "Authorization: Bearer $YOUR_JWT" \
--data '{
"model": "mistral-large2",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "what are the top 3 customers by revenue"
}
]
}
],
"tools": [
{
"tool_spec": {
"type": "cortex_search",
"name": "search1"
}
},
{
"tool_spec": {
"type": "cortex_analyst_text_to_sql",
"name": "analyst1"
}
}
],
"tool_resources": {
"search1": {"name": "testdb.testschema.transcript_search_service"},
"analyst1": {"semantic_model_file": "@testdb.testschema.stage/sample_data_model.yaml"}
}
}'
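The same request can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper names build_agent_request and build_http_request are hypothetical, and the endpoint path, headers, and field names are copied from the cURL example above rather than from any additional API reference.

```python
import json
import urllib.request


def build_agent_request(question, search_service, semantic_model_file,
                        model="mistral-large2"):
    """Build the JSON body used in the cURL example."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": question}]}
        ],
        "tools": [
            {"tool_spec": {"type": "cortex_search", "name": "search1"}},
            {"tool_spec": {"type": "cortex_analyst_text_to_sql", "name": "analyst1"}},
        ],
        # Keys must match the tool names declared in "tools".
        "tool_resources": {
            "search1": {"name": search_service},
            "analyst1": {"semantic_model_file": semantic_model_file},
        },
    }


def build_http_request(base_url, jwt_token, payload):
    """Wrap the body in a POST request; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        url=f"{base_url}/api/v2/cortex/agent:run",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "X-Snowflake-Authorization-Token-Type": "KEYPAIR_JWT",
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {jwt_token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Keeping construction separate from sending makes the payload easy to inspect or log beforehand.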
The response is streamed incrementally to the client.
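Because the response arrives in pieces, a client needs to read it incrementally. The sketch below assumes, without confirmation from this page, that the stream uses the server-sent events convention of "data:" lines carrying JSON. The function name iter_sse_data is hypothetical, and the shape of each decoded payload is deliberately left uninterpreted.

```python
import json


def iter_sse_data(lines):
    """Yield JSON payloads from 'data:' lines, skipping anything else.

    Assumes one JSON document per 'data:' line; multi-line events are not handled.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # event names, comments, and blank separators
        body = line[len("data:"):].strip()
        try:
            yield json.loads(body)
        except json.JSONDecodeError:
            continue  # non-JSON markers such as end-of-stream sentinels
```

An HTTP response object returned by urllib.request.urlopen can be iterated line by line, so it could be passed directly as the lines argument.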
Supported models¶
You can use the following models with Cortex Agents to generate the response. Note that the selected model is not used for orchestration.
llama3.1-70b
llama3.3-70b
mistral-large2
claude-3-5-sonnet
You can also specify a response instruction to customize the agent’s responses.
{
"response-instruction": "You will always maintain a friendly tone and provide concise responses"
}
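A client can guard against unsupported model names before sending a request. The sketch below is one way to do that; the helper name with_response_settings is hypothetical, and placing "response-instruction" at the top level of the request body is an assumption drawn from the snippet above.

```python
# Model names copied from the list of supported models above.
SUPPORTED_MODELS = (
    "llama3.1-70b",
    "llama3.3-70b",
    "mistral-large2",
    "claude-3-5-sonnet",
)


def with_response_settings(payload, model, response_instruction=None):
    """Return a copy of the payload with the model and optional instruction set."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    updated = dict(payload)  # shallow copy so the caller's dict is untouched
    updated["model"] = model
    if response_instruction is not None:
        # Assumed to be a top-level field of the request body.
        updated["response-instruction"] = response_instruction
    return updated
```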
Important
Cortex Agents uses models that might not be available in all regions. For more information, see Availability.
Availability¶
The Cortex Agents capability is available in the following regions:
AWS US West 2 (Oregon) | AWS US East 1 (N. Virginia) | AWS Europe Central 1 (Frankfurt) | AWS Europe West 1 (Ireland) | AWS AP Southeast 2 (Sydney) | AWS AP Northeast 1 (Tokyo) | Azure East US 2 (Virginia) | Azure West Europe (Netherlands) |
---|---|---|---|---|---|---|---|
✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Cost considerations¶
In preview, the Cortex Analyst and Cortex Search services incur costs per the details listed in the Snowflake Service Consumption Table.
Build a Cortex Agent¶
We will use Cortex Agents to build a conversational application that answers questions from business users about contract terms. Let us review the main components. (For the complete tutorial, see Getting Started with Cortex Agents.)
Step 1. Make the tools available as resources to the Cortex Agent
{
"tool_resources": {
"Analyst1": {
"semantic_model_file": "@cortex_tutorial_db.public.revenue_semantic_model.yaml"
},
"Search1": {
"name": "cortex_tutorial_db.public.contract_terms"
}
}
}
Step 2. Specify the tools you want to use in the request
{
"tools": [
{
"tool_spec": {
"name": "Analyst1",
"type": "cortex_analyst_text_to_sql"
}
},
{
"tool_spec": {
"name": "Search1",
"type": "cortex_search"
}
}
]
}
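Steps 1 and 2 are linked by name: each tool listed in "tools" looks up its resource under the same key in "tool_resources". The check below is a small sketch of how a client could catch a mismatch, such as a difference in letter case, before sending the request. The function name is hypothetical, and it assumes every listed tool needs a backing resource, which holds for the two tools in this example.

```python
def missing_tool_resources(tools, tool_resources):
    """Return the tool names that have no entry in tool_resources."""
    names = [tool["tool_spec"]["name"] for tool in tools]
    # Lookup is exact, so "analyst1" does not match "Analyst1".
    return [name for name in names if name not in tool_resources]
```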
Step 3. Now we will specify the model and the system prompt to generate the response
{
"model": "claude-3-5-sonnet",
"messages": [
{
"role": "system",
"content": [
{
"type": "text",
"text": "You’re a friendly assistant to answer questions."
}
]
}
]
}
Step 4. Create a semantic model file that will be used by the Analyst tool to access structured data
Follow steps 1 to 3 in Getting Started with Cortex Agents to create a Cortex Analyst semantic model.
Step 5. Next, set up a Cortex Search Service that the Search tool will use to access unstructured data
Follow steps 4 and 5 in Getting Started with Cortex Agents to create a Cortex Search Service.
Step 6. We are now ready to interact with the Agent. You will use the messages field to send requests and receive responses
{
"model": "claude-3-5-sonnet",
"messages": [
{
"role": "system",
"content": [
{
"type": "text",
"text": "You’re a friendly assistant to answer questions"
}
]
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "hello"
}
]
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "hi there!"
}
]
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "..."
}
]
}
]
}
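Since each request carries the full conversation, a client typically keeps a running list of messages and appends to it after every turn. The following sketch shows one way to do that with the message shape used above; the helper names text_message and append_turn are hypothetical.

```python
def text_message(role, text):
    """Build a message whose content is a single text item."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unexpected role: {role!r}")
    return {"role": role, "content": [{"type": "text", "text": text}]}


def append_turn(messages, role, text):
    """Return a new message list with one more turn; the input is not mutated."""
    return messages + [text_message(role, text)]
```

For example, starting from an empty list, appending a system prompt, a user "hello", and an assistant "hi there!" reproduces the first three messages of the request above.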
Step 7. As the interaction proceeds, the Agent identifies the tools it needs and executes them (service-side) to fulfill the task. In the example below, the Agent identifies Text2SQL as the tool and executes it to get the SQL query
{
"role": "assistant",
"content": [
{
"type": "tool_use",
"tool_use": {
"tool_use_id": "tool_001",
"name": "cortex_analyst_text_to_sql",
"input": {
"query": "...",
"semantic_model_file": "..."
}
}
},
{
"type": "tool_results",
"tool_results": {
"status": "success",
"tool_use_id": "tool_001",
"content": [
{
"type": "json",
"json": {
"sql": "select * from table"
}
}
]
}
}
]
}
Step 8. During the interaction, the Agent may request that the client application execute a tool (client-side). For example, the Agent specifies the SQL query that should be executed.
{
"role": "assistant",
"content": [
{
"type": "tool_use",
"tool_use": {
"tool_use_id": "tool_002",
"name": "cortex_analyst_sql_exec",
"input": {
"sql": "select * from table"
}
}
}
]
}
The client executes the query and returns the results to the Agent:
{
"role": "user",
"content": [
{
"type": "tool_results",
"tool_results": {
"status": "success",
"tool_use_id": "tool_002",
"content": [
{
"type": "json",
"json": {
"Query_id": "01ba1888-..",
"warehouse": "WAREHOUSE_USED_FOR_SQL_EXEC"
}
}
]
}
}
]
}
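The client-side exchange in Step 8 can be sketched as follows. The code scans an assistant message for tool_use items that have no matching tool_results yet, runs each SQL statement through a caller-supplied function, and wraps the outcome in a user message shaped like the one above. The helper names and the run_sql callable are hypothetical, and treating cortex_analyst_sql_exec as the only client-side tool is an assumption taken from this example.

```python
def pending_client_tool_uses(assistant_message,
                             client_tools=("cortex_analyst_sql_exec",)):
    """Return tool_use entries the client must run (no tool_results yet)."""
    content = assistant_message.get("content", [])
    answered = {
        item["tool_results"]["tool_use_id"]
        for item in content if item.get("type") == "tool_results"
    }
    return [
        item["tool_use"] for item in content
        if item.get("type") == "tool_use"
        and item["tool_use"]["name"] in client_tools
        and item["tool_use"]["tool_use_id"] not in answered
    ]


def tool_results_message(tool_use_id, result_json, status="success"):
    """Wrap a result in the user message format shown above."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_results",
            "tool_results": {
                "status": status,
                "tool_use_id": tool_use_id,
                "content": [{"type": "json", "json": result_json}],
            },
        }],
    }


def answer_tool_uses(assistant_message, run_sql):
    """Run each pending SQL request with run_sql(sql) and build the replies."""
    return [
        tool_results_message(use["tool_use_id"], run_sql(use["input"]["sql"]))
        for use in pending_client_tool_uses(assistant_message)
    ]
```

Here run_sql stands for whatever the application uses to execute a query and return a JSON-serializable dictionary. Exceptions it raises are left to propagate so the caller can decide how to report failures.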
Step 9. The agent generates the final response using the LLM and the response instruction.
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "The top three vendors by revenue are Acme .."
}
]
}
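To show the final answer to the user, the client only needs the text items of the last assistant message. A minimal sketch, with the hypothetical helper name final_text:

```python
def final_text(assistant_message):
    """Concatenate the text items of an assistant message, ignoring tool items."""
    return "".join(
        item["text"]
        for item in assistant_message.get("content", [])
        if item.get("type") == "text"
    )
```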