- Categories:
System functions (Control)
EXECUTE_AI_EVALUATION¶
Start or get the status of a Cortex Agent evaluation run.
For more information on Cortex Agent evaluations, see Cortex Agent evaluations.
- See also:
GET_AI_RECORD_TRACE (SNOWFLAKE.LOCAL) , GET_AI_EVALUATION_DATA (SNOWFLAKE.LOCAL) , GET_AI_OBSERVABILITY_LOGS (SNOWFLAKE.LOCAL)
Syntax¶
Arguments¶
evaluation_jobOne of the following values:
‘START’: Starts an evaluation
‘STATUS’: Retrieves the status of an evaluation
run_parametersA SQL OBJECT value that contains the following key:
run_name: The name of the run to perform theevaluation_joboperation on.
config_file_pathA stage file path pointing to an agent evaluation configuration. This path can’t be a signed URL. For the full configuration YAML specification, see Agent Evaluation YAML specification.
Returns¶
The return value of this function depends on the evaluation_job:
‘START’ returns a single string message, indicating whether the SQL execution succeeded or failed.
‘STATUS’ returns a table containing information on the current state of the evaluation run.
The table returned by the ‘STATUS’ evaluation job has the following columns:
Name |
Type |
Description |
|---|---|---|
RUN_NAME |
VARCHAR |
The name of the evaluation run. |
AGENT_NAME |
VARCHAR |
The (unqualified) name of the agent being evaluated. |
AGENT_TYPE |
VARCHAR |
The type of agent being evaluated. |
STATUS |
VARCHAR |
The current status of the evaluation run. |
STATUS_DETAILS |
ARRAY |
An array of error messages that occured during this run. |
Values in the STATUS column are one of:
Status |
Description |
|---|---|
CREATED |
The run has been created but not started. |
INVOCATION_IN_PROGRESS |
The run invocation is in the process of generating the output and the traces. |
INVOCATION_COMPLETED |
The run invocation completed with all outputs and traces created. |
INVOCATION_PARTIALLY_COMPLETED |
The run invocation is partially completed due to failures in application invocation and trace generation. |
COMPUTATION_IN_PROGRESS |
The metric computation is in progress. |
COMPLETED |
The metric computation is completed with detailed outputs and traces. |
PARTIALLY_COMPLETED |
The run is partially completed due to failures during the metric computation. |
CANCELLED |
The run has been cancelled. |
Access control requirements¶
For the full access control requirements to conduct a Cortex Agent evaluation, see Cortex Agent evaluatons – Access control requirements.
Examples¶
The following example starts a run called run-1 using the agent evaluation configuration from @eval_db.eval_schema.metrics/agent_evaluation_config.yaml:
The following example queries the status of the evaluation run run-1 using the agent configuration from @eval_db.eval_schema.metrics/agent_evaluation_config.yaml: