Categories:

Table functions (Cortex Agents)

GET_AI_EVALUATION_DATA (SNOWFLAKE.LOCAL)

Retrieves the evaluation data for an evaluation run of a Cortex Agent or an External Agent application (see External Agent commands).

Call this function to inspect all recorded traces for an evaluation run. For more information on Cortex Agent evaluations, see Cortex Agent evaluations. For AI Observability applications, see Observability data.

See also:

EXECUTE_AI_EVALUATION, GET_AI_RECORD_TRACE (SNOWFLAKE.LOCAL), GET_AI_OBSERVABILITY_LOGS (SNOWFLAKE.LOCAL), GET_AI_OBSERVABILITY_EVENTS (SNOWFLAKE.LOCAL)

Syntax

SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA( <database> , <schema> , <agent_name> , <agent_type> , <run_name> )

Arguments

database

Name of the database containing the agent.

schema

Name of the schema containing the agent.

agent_name

Name of the agent to retrieve evaluation data for.

agent_type

The agent type string. Use CORTEX AGENT for a Cortex Agent or EXTERNAL AGENT for an External Agent object. This value is case-insensitive.

run_name

Name of the run to retrieve full evaluation data for.
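As a minimal sketch of how the five arguments line up, the call below targets an External Agent and passes agent_type in lowercase, which the description above says is accepted because the value is case-insensitive. The names obs_db, obs_schema, my_external_agent, and run-2 are hypothetical placeholders, not objects defined in this topic.

```sql
-- Hypothetical object names; replace them with your own database, schema,
-- External Agent, and run name.
SELECT *
FROM TABLE(SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA(
  'obs_db',             -- database containing the agent
  'obs_schema',         -- schema containing the agent
  'my_external_agent',  -- agent_name
  'external agent',     -- agent_type (case-insensitive)
  'run-2'               -- run_name
));
```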

Returns

A table containing information for the specified evaluation, with the following columns:

Column | Data type | Description
RECORD_ID | VARCHAR | The unique identifier assigned by Snowflake for this evaluation record.
INPUT_ID | VARCHAR | The unique identifier assigned by Snowflake for this evaluation input.
REQUEST_ID | VARCHAR | The unique identifier assigned by Snowflake for this request.
TIMESTAMP | TIMESTAMP_TZ | The time (in UTC) at which the request was made.
DURATION_MS | INT | The amount of time, in milliseconds, that it took for the agent to return a response.
INPUT | VARCHAR | The query string used as input for this evaluation record.
OUTPUT | VARCHAR | The response returned by the Cortex Agent for this evaluation record.
ERROR | VARCHAR | Information about any errors that occurred during the request.
GROUND_TRUTH | VARCHAR | The ground truth information used to evaluate this record’s Cortex Agent output. This column holds the JSON from your dataset’s ground truth column, serialized as a string. For how {{ground_truth}} in custom metrics relates to this value, see the notes under Evaluation results table format.
METRIC_NAME | VARCHAR | The name of the metric evaluated for this record.
EVAL_AGG_SCORE | NUMBER | The evaluation score assigned for this record.
METRIC_TYPE | VARCHAR | The type of metric being evaluated. For built-in metrics, the value is system. For custom metrics, the value is custom.
METRIC_STATUS | VARIANT | A map containing information about the agent’s HTTP response for this record, with the following keys:
  • status: The HTTP status code of the response.
  • message: The HTTP message sent in the status response.
METRIC_CALLS | ARRAY | An array of VARIANT values that contain information about the computed metric. Each array entry contains the metric’s criteria, an explanation of the metric score, and metadata. The keys of each entry are:
  • criteria: The criteria used by an LLM judge to evaluate response correctness.
  • explanation: An explanation of why the score was assigned.
  • full_metadata: A VARIANT value that contains metadata and information about this metric’s processing by the LLM judge. The keys of this map include:
    • completion_tokens: The number of output tokens generated by the LLM for this metric evaluation call.
    • normalized_score: The original evaluation score normalized to the range [0.0, 1.0], rounded to two decimal places.
    • original_score: The original score assigned by this metric evaluation for the record.
    • prompt_tokens: The number of tokens taken up by the prompt provided to the LLM judge.
    • total_tokens: The total number of tokens used by the LLM judge for this computation.
TOTAL_INPUT_TOKENS | INT | The total number of tokens used to process the input query.
TOTAL_OUTPUT_TOKENS | INT | The total number of output tokens produced by the Cortex Agent.
LLM_CALL_COUNT | INT | The number of times any LLM was called, either by the agent or by an evaluation judge.
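The semi-structured columns can be unpacked with standard Snowflake path notation and FLATTEN. The sketch below is illustrative only: it assumes the key names listed above appear exactly as documented, reuses the eval_db.eval_schema.evaluated_agent names from the Examples section, and picks the cast types (VARCHAR, INT, FLOAT) as assumptions rather than documented types.

```sql
-- One output row per METRIC_CALLS entry, with the judge's explanation and
-- token usage pulled out of full_metadata. Cast types are assumptions.
SELECT
  e.record_id,
  e.metric_name,
  e.metric_status:status::VARCHAR                      AS status_code,
  c.value:criteria::VARCHAR                            AS criteria,
  c.value:explanation::VARCHAR                         AS explanation,
  c.value:full_metadata:original_score::FLOAT          AS original_score,
  c.value:full_metadata:normalized_score::FLOAT        AS normalized_score,
  c.value:full_metadata:total_tokens::INT              AS judge_total_tokens,
  TRY_PARSE_JSON(e.ground_truth)                       AS ground_truth_json
FROM TABLE(SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA(
       'eval_db', 'eval_schema', 'evaluated_agent', 'CORTEX AGENT', 'run-1')) AS e,
     LATERAL FLATTEN(input => e.metric_calls) AS c;
```

TRY_PARSE_JSON is used for GROUND_TRUTH because the column is a serialized string; it returns NULL instead of raising an error if a value is not valid JSON. Rows whose METRIC_CALLS array is empty or NULL are dropped by FLATTEN unless you add OUTER => TRUE.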

Access control requirements

A role used to execute this operation must have the following privileges at a minimum:

Privilege | Object | Notes
CORTEX_USER | Database role |
USAGE | Cortex Agent or External Agent | Required on the object identified by agent_name. For EXTERNAL AGENT, USAGE on the External Agent is sufficient to call this function (MONITOR does not apply).
MONITOR | Cortex Agent | Required on the Cortex Agent identified by agent_name when agent_type is CORTEX AGENT. Does not apply when agent_type is EXTERNAL AGENT.

Operating on an object in a schema requires at least one privilege on the parent database and at least one privilege on the parent schema.

For instructions on creating a custom role with a specified set of privileges, see Creating custom roles.

For general information about roles and privilege grants for performing SQL actions on securable objects, see Overview of Access Control.

When agent_type is EXTERNAL AGENT, only USAGE on that object is required to call this function. OWNERSHIP on the External Agent is required to modify or remove the object with ALTER EXTERNAL AGENT or DROP EXTERNAL AGENT.

For the full access control permissions required by Cortex Agent evaluations, see Cortex Agent evaluations – Access control requirements. For External Agent objects, see Observability data.
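The privileges above could be granted to a custom role along the following lines. This is a sketch, not a verified procedure: the role name eval_reader is hypothetical, the object names are borrowed from the Examples section, and the AGENT object-type keyword in the last two statements is an assumption that you should confirm against the GRANT reference for your account.

```sql
-- Hypothetical role; run these statements with a role that is allowed to grant.
CREATE ROLE IF NOT EXISTS eval_reader;

-- CORTEX_USER is a database role in the SNOWFLAKE database.
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE eval_reader;

-- At least one privilege is needed on the parent database and schema.
GRANT USAGE ON DATABASE eval_db TO ROLE eval_reader;
GRANT USAGE ON SCHEMA eval_db.eval_schema TO ROLE eval_reader;

-- Assumption: Cortex Agents are granted with the AGENT keyword.
GRANT USAGE ON AGENT eval_db.eval_schema.evaluated_agent TO ROLE eval_reader;
GRANT MONITOR ON AGENT eval_db.eval_schema.evaluated_agent TO ROLE eval_reader;
```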

Examples

The following example displays the full evaluation details for a run named run-1 of the agent evaluated_agent, which is stored in the schema eval_db.eval_schema:

SELECT * FROM TABLE(SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA(
  'eval_db',
  'eval_schema',
  'evaluated_agent',
  'CORTEX AGENT',
  'run-1')
);
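Because the function returns a regular table, its output can be aggregated or filtered like any other query result. The two queries below are a sketch using the same object names as the example above: the first averages EVAL_AGG_SCORE per metric, and the second lists records that reported an error. Whether ERROR is NULL or an empty string for successful requests is not stated in this topic, so the filter checks both as an assumption.

```sql
-- Average score and record count per metric for the run.
SELECT
  metric_name,
  metric_type,
  COUNT(*)            AS record_count,
  AVG(eval_agg_score) AS avg_score,
  AVG(duration_ms)    AS avg_duration_ms
FROM TABLE(SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA(
  'eval_db', 'eval_schema', 'evaluated_agent', 'CORTEX AGENT', 'run-1'))
GROUP BY metric_name, metric_type
ORDER BY metric_name;

-- Records that reported an error (assumes no error means NULL or '').
SELECT e.record_id, e.input, e.error
FROM TABLE(SNOWFLAKE.LOCAL.GET_AI_EVALUATION_DATA(
  'eval_db', 'eval_schema', 'evaluated_agent', 'CORTEX AGENT', 'run-1')) AS e
WHERE e.error IS NOT NULL
  AND e.error <> '';
```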