- Categories:
System functions (Query Information) , Table functions
GET_QUERY_OPERATOR_STATS¶
Returns statistics about individual query operators within a query that has completed. You can run this function for any completed query that was executed in the past 14 days.
You can use this information to understand the structure of a query and identify query operators (e.g. the join operator) that cause performance problems.
For example, you can use this information to determine which operators are consuming the most resources. As another example, you can use this function to identify joins that have more output rows than input rows, which can be a sign of an “exploding” join (e.g. an unintended Cartesian product).
These statistics are also available in the query profile tab in Snowsight.
The GET_QUERY_OPERATOR_STATS()
function makes the same information available via a programmatic interface.
For more information about finding problematic query operators, see Common query problems identified by Query Profile.
Syntax¶
GET_QUERY_OPERATOR_STATS( <query_id> )
Arguments¶
query_id
The ID of a query. You can use:
A string literal (a string surrounded by single quotes).
A session variable containing a query ID.
The return value from a call to the LAST_QUERY_ID function.
Returns¶
The GET_QUERY_OPERATOR_STATS function is a table function. It returns rows with statistics about each query operator in the query. For more information, see the Usage notes and Output sections below.
Usage notes¶
This function only returns statistics on queries that have completed.
You must have OPERATE or MONITOR privileges on the warehouse where you ran the query.
This function provides detailed statistics about each query operator used in the specified query. The possible query operators include:
Aggregate: Groups inputs and computes aggregate functions.
CartesianJoin: A specialized type of join.
Delete: Removes a record from a table.
ExternalFunction: Represents processing by an external function.
ExternalScan: Represents access to data stored in stage objects.
Filter: Represents an operation that filters the rows.
Flatten: Processes VARIANT records, possibly flattening them on a specified path.
Generator: Generates records using the TABLE(GENERATOR(…)) construct.
GroupingSets: Represents constructs such as GROUPING SETS, ROLLUP, and CUBE.
Insert: Adds a record to a table either through an INSERT or COPY operation.
InternalObject: Represents access to an internal data object (e.g in an Information Schema or the result of a previous query).
Join: Combines two inputs on a given condition.
JoinFilter: Special filtering operation that removes tuples that can be identified as not possibly matching the condition of a Join further in the query plan.
Merge: Performs a MERGE operation on a table.
Pivot: Transforms unique values from a column into multiple columns and does any necessary aggregation.
Result: Returns the query result.
Sort: Orders input on a given expression.
SortWithLimit: Produces a part of the input sequence after sorting, typically a result of an
ORDER BY ... LIMIT ... OFFSET ...
construct.TableScan: Represents access to a single table.
UnionAll: Concatenates two inputs.
Unload: Represents a COPY operation that exports data from a table to a file in a stage.
Unpivot: Rotates a table by transforming columns into rows.
Update: Updates a record in a table.
ValuesClause: List of values provided with the VALUES clause.
WindowFunction: Computes window functions.
WithClause: Precedes the body of the SELECT statement, and defines one or more CTEs.
WithReference: Instance of a WITH clause.
The information is returned as a table. Each row in the table corresponds to one operator. The row contains the execution breakdown and the query statistics for that operator.
The row may also list operator attributes (these depend on the type of operator).
Statistics that break down query execution time are expressed as a percentage of the time consumed by the total query.
For more information about specific statistics, see Output (in this topic).
Because this function is a table function, you must use it in a FROM clause and you must wrap it in
TABLE()
. For example:select * from table(get_query_operator_stats(last_query_id()));
For each individual execution of a specific query (i.e. a specific UUID), this function is deterministic; it returns the same values each time.
However, for different executions of the same query text, this function can return different runtime statistics. The statistics depend on many factors. The following factors can have a major impact on the execution and therefore on the statistics returned by this function:
The volume of data.
The availability of materialized views, and the changes (if any) to the data since those materialized views were last refreshed.
The presence or absence of clustering.
The presence or absence of previously-cached data.
The size of the virtual warehouse.
The values can also be affected by factors outside the user’s query and data. These factors are usually small. The factors include:
Virtual warehouse initialization time.
Latency with external functions.
Output¶
The function returns the following columns:
Column Name |
Data Type |
Description |
---|---|---|
QUERY_ID |
VARCHAR |
The query ID, which is an internal, system-generated identifier for the SQL statement. |
STEP_ID |
NUMBER(38, 0) |
Identifier of the step in the query plan. |
OPERATOR_ID |
NUMBER(38, 0) |
The operator’s identifier. This is unique within the query. Values start at 0. |
PARENT_OPERATORS |
ARRAY containing one or more NUMBER(38, 0) |
Identifiers of the parent operators for this operator, or NULL if this is the final operator in the query plan (which is usually the Result operator). |
OPERATOR_TYPE |
VARCHAR |
The type of query operator (e.g. |
VARIANT containing an OBJECT |
Statistics about the operator (e.g. the number of output rows from the operator). |
|
VARIANT containing an OBJECT |
Information about the execution time of the operator. |
|
VARIANT containing an OBJECT |
Information about the operator. This information depends upon the operator type. |
If there is no information for the specific column for the operator, the value is NULL.
Three of these columns contain OBJECTs. Each object contains key/value pairs. The tables below show information about the keys in these tables.
OPERATOR_STATISTICS¶
The fields in the OBJECTs for the OPERATOR_STATISTICS
column provide additional information about the operator. The
information can include:
Key |
Nested Key (if applicable) |
Data Type |
Description |
---|---|---|---|
|
Statistics for Data Manipulation Language (DML) queries: |
||
|
DOUBLE |
The number of rows inserted into a table (or tables). |
|
|
DOUBLE |
The number of rows updated in a table. |
|
|
DOUBLE |
The number of rows deleted from a table. |
|
|
DOUBLE |
The number of rows unloaded during data export. |
|
|
Information about calls to external functions.
If the value of a field, for example |
||
|
DOUBLE |
The number of times that an external function was called. (This can be different from the number of external function calls in the text of the SQL statement due to the number of batches that rows are divided into, the number of retries (if there are transient network problems), etc.) |
|
|
DOUBLE |
The number of rows sent to external functions. |
|
|
DOUBLE |
The number of rows received back from external functions. |
|
|
DOUBLE |
The number of bytes sent to external functions. If the key includes |
|
|
DOUBLE |
The number of bytes received from external functions. If the key includes |
|
|
DOUBLE |
The number of retries due to transient errors. |
|
|
DOUBLE |
The average amount of time per invocation (call) in milliseconds between the time Snowflake sent the data and received the returned data. |
|
|
INTEGER |
Total number of HTTP requests that returned a 4xx status code. |
|
|
INTEGER |
Total number of HTTP requests that returned a 5xx status code. |
|
|
DOUBLE |
Average latency for successful HTTP requests. |
|
|
DOUBLE |
Average overhead per successful request due to a slowdown caused by throttling (HTTP 429). |
|
|
DOUBLE |
Number of batches that were retried due to HTTP 429 errors. |
|
|
DOUBLE |
50th percentile latency for successful HTTP requests. 50 percent of all successful requests took less than this time to complete. |
|
|
DOUBLE |
90th percentile latency for successful HTTP requests. 90 percent of all successful requests took less than this time to complete. |
|
|
DOUBLE |
95th percentile latency for successful HTTP requests. 95 percent of all successful requests took less than this time to complete. |
|
|
DOUBLE |
99th percentile latency for successful HTTP requests. 99 percent of all successful requests took less than this time to complete. |
|
|
INTEGER |
The number of input rows. This can be missing for an operator with no input edges from other operators. |
|
|
Information about the I/O (input/output) operations performed during the query. |
||
|
DOUBLE |
The percentage of data scanned for a given table so far. |
|
|
DOUBLE |
The number of bytes scanned so far. |
|
|
DOUBLE |
The percentage of data scanned from the local disk cache. |
|
|
DOUBLE |
Bytes written (e.g. when loading into a table). |
|
|
DOUBLE |
Bytes written to a result object. For example, In general, the results object represents whatever is produced as a result of the query, and |
|
|
DOUBLE |
Bytes read from a result object. |
|
|
DOUBLE |
Bytes read from an external object (e.g. a stage). |
|
|
|
DOUBLE |
The amount of data sent over the network. |
|
INTEGER |
The number of output rows. This can be missing for the operator that returns the results to the user (which is usually the RESULT operator). |
|
|
Information on table pruning. |
||
|
DOUBLE |
The number of partitions scanned so far. |
|
|
DOUBLE |
The total number of partitions in a given table. |
|
|
Information about disk usage for operations in which intermediate results do not fit in memory. |
||
|
DOUBLE |
The volume of data spilled to remote disk. |
|
|
DOUBLE |
The volume of data spilled to local disk. |
|
|
Information about calls to extension functions. If the value of a field is zero then the field is not displayed. |
||
|
DOUBLE |
amount of time for the Java UDF handler to load. |
|
|
DOUBLE |
number of times the Java UDF handler is invoked. |
|
|
DOUBLE |
maximum amount of time for the Java UDF handler to execute. |
|
|
DOUBLE |
average amount of time to execute the Java UDF handler. |
|
|
DOUBLE |
number of times the Java UDTF process method was invoked. |
|
|
DOUBLE |
amount of time to execute the Java UDTF process. |
|
|
DOUBLE |
average amount of time to execute the Java UDTF process. |
|
|
DOUBLE |
number of times the Java UDTF constructor was invoked. |
|
|
DOUBLE |
amount of time to execute the Java UDTF constructor. |
|
|
DOUBLE |
average amount of time to execute the Java UDTF constructor. |
|
|
DOUBLE |
number of times the Java UDTF endPartition method was invoked. |
|
|
DOUBLE |
amount of time to execute the Java UDTF endPartition method. |
|
|
DOUBLE |
average amount of time to execute the Java UDTF endPartition method. |
|
|
DOUBLE |
maximum amount of time to download the Java UDF dependencies. |
|
|
DOUBLE |
peak memory usage as reported by the JVM. |
|
|
DOUBLE |
compile time for the Java UDF inline code. |
|
|
DOUBLE |
number of times the Python UDF handler was invoked. |
|
|
DOUBLE |
total execution time for Python UDF handler. |
|
|
DOUBLE |
average amount of time to execute the Python UDF handler. |
|
|
DOUBLE |
peak memory usage by the Python sandbox environment. |
|
|
DOUBLE |
average amount of time to create the Python environment, including downloading and installing packages. |
|
|
DOUBLE |
amount of time to run the Conda solver to solve Python packages. |
|
|
DOUBLE |
amount of time to create the Python environment. |
|
|
DOUBLE |
amount of time to initialize the Python UDF. |
|
|
DOUBLE |
number of external file bytes read for UDFs. |
|
|
DOUBLE |
number of external files accessed for UDFs. |
EXECUTION_TIME_BREAKDOWN¶
The fields in the OBJECTs for the EXECUTION_TIME_BREAKDOWN
column are shown below.
Key |
Data Type |
Description |
---|---|---|
|
DOUBLE |
The percentage of the total query time spent by this operator. |
|
DOUBLE |
Time spent setting up query processing. |
|
DOUBLE |
Time spent processing the data by the CPU. |
|
DOUBLE |
Time spent synchronizing activities between participating processes. |
|
DOUBLE |
Time during which processing was blocked while waiting for local disk access. |
|
DOUBLE |
Time during which processing was blocked while waiting for remote disk access. |
|
DOUBLE |
Time during which processing was waiting for network data transfer. |
OPERATOR_ATTRIBUTES¶
Each output row describes one operator in the query. The table below shows the possible types of operators (e.g. the Filter operator). For each type of operator, the table shows the possible attributes (e.g. the expression used to filter the rows).
The operator attributes are stored in the OPERATOR_ATTRIBUTES
column, which is of type VARIANT and contains an
OBJECT. The OBJECT contains key/value pairs. Each key corresponds to one attribute of the operator.
Operator Name |
Key |
Data Type |
Description |
---|---|---|---|
|
|||
|
ARRAY of VARCHAR |
List of functions computed. |
|
|
ARRAY of VARCHAR |
The group-by expression. |
|
|
|||
|
VARCHAR |
Non-equality join expression. |
|
|
VARCHAR |
Equality join expression. |
|
|
VARCHAR |
Type of join (INNER). |
|
|
|
VARCHAR |
Name of updated table. |
|
|||
|
VARCHAR |
The name of the stage from which the data is read. |
|
|
VARCHAR |
The type of the stage. |
|
|
|
VARCHAR |
The expression used to filter data. |
|
|
VARCHAR |
The input expression used to flatten data. |
|
|||
|
NUMBER |
Value of the input parameter ROWCOUNT. |
|
|
NUMBER |
Value of the input parameter TIMELIMIT. |
|
|
|||
|
ARRAY of VARCHAR |
List of functions computed. |
|
|
ARRAY of VARCHAR |
List of grouping sets. |
|
|
|||
|
VARCHAR |
Which expressions are inserted. |
|
|
ARRAY of VARCHAR |
List of table names to which records are added. |
|
|
|
VARCHAR |
The name of the accessed object. |
|
|||
|
VARCHAR |
Non-equality join expression. |
|
|
VARCHAR |
Equality join expression. |
|
|
VARCHAR |
Type of join (INNER, OUTER, LEFT JOIN, etc.). |
|
|
|
NUMBER |
The operator id of the join used to identify tuples that can be filtered out. |
|
|
VARCHAR |
Name of updated table. |
|
|||
|
ARRAY of VARCHAR |
Remaining columns on which the results are aggregated. |
|
|
ARRAY of VARCHAR |
Resulting columns of pivot values. |
|
|
|
ARRAY of VARCHAR |
List of expressions produced. |
|
|
ARRAY of VARCHAR |
Expression defining the sorting order. |
|
|||
|
NUMBER |
Position in the ordered sequence from which produced tuples are emitted. |
|
|
NUMBER |
Number of rows produced. |
|
|
ARRAY of VARCHAR |
Expression defining the sorting order. |
|
|
|||
|
ARRAY of VARCHAR |
List of scanned columns. |
|
|
ARRAY of VARCHAR |
List of paths extracted from variant columns. |
|
|
VARCHAR |
Alias of table being accessed. |
|
|
VARCHAR |
Name of table being accessed. |
|
|
|
VARCHAR |
Stage where data is saved. |
|
|
ARRAY of VARCHAR |
Output columns of the unpivot query. |
|
|
VARCHAR |
Name of updated table. |
|
|||
|
NUMBER |
Number of produced values. |
|
|
VARCHAR |
List of values. |
|
|
|
ARRAY of VARCHAR |
List of functions computed. |
|
|
VARCHAR |
Alias of WITH clause. |
If an operator is not listed, no attributes are produced, and the value is reported as {}
.
Note
The following operators do not have any operator attributes and therefore are not included in the table of
OPERATOR_ATTRIBUTES
:UnionAll
ExternalFunction
Examples¶
Retrieving data about a single query¶
This example shows the statistics for a SELECT that joins two small tables.
Run the SELECT statement:
select x1.i, x2.i
from x1 inner join x2 on x2.i = x1.i
order by x1.i, x2.i;
Get the query ID:
set lqid = (select last_query_id());
Call GET_QUERY_OPERATOR_STATS() to get statistics about the individual query operators in the query:
select * from table(get_query_operator_stats($lqid));
+--------------------------------------+---------+-------------+--------------------+---------------+-----------------------------------------+-----------------------------------------------+----------------------------------------------------------------------+
| QUERY_ID | STEP_ID | OPERATOR_ID | PARENT_OPERATORS | OPERATOR_TYPE | OPERATOR_STATISTICS | EXECUTION_TIME_BREAKDOWN | OPERATOR_ATTRIBUTES |
|--------------------------------------+---------+-------------+--------------------+---------------+-----------------------------------------+-----------------------------------------------+----------------------------------------------------------------------|
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 0 | NULL | Result | { | { | { |
| | | | | | "input_rows": 64 | "overall_percentage": 0.000000000000000e+00 | "expressions": [ |
| | | | | | } | } | "X1.I", |
| | | | | | | | "X2.I" |
| | | | | | | | ] |
| | | | | | | | } |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 1 | [ 0 ] | Sort | { | { | { |
| | | | | | "input_rows": 64, | "overall_percentage": 0.000000000000000e+00 | "sort_keys": [ |
| | | | | | "output_rows": 64 | } | "X1.I ASC NULLS LAST", |
| | | | | | } | | "X2.I ASC NULLS LAST" |
| | | | | | | | ] |
| | | | | | | | } |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 2 | [ 1 ] | Join | { | { | { |
| | | | | | "input_rows": 128, | "overall_percentage": 0.000000000000000e+00 | "equality_join_condition": "(X2.I = X1.I)", |
| | | | | | "output_rows": 64 | } | "join_type": "INNER" |
| | | | | | } | | } |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 3 | [ 2 ] | TableScan | { | { | { |
| | | | | | "io": { | "overall_percentage": 0.000000000000000e+00 | "columns": [ |
| | | | | | "bytes_scanned": 1024, | } | "I" |
| | | | | | "percentage_scanned_from_cache": 1, | | ], |
| | | | | | "scan_progress": 1 | | "table_name": "MY_DB.MY_SCHEMA.X2" |
| | | | | | }, | | } |
| | | | | | "output_rows": 64, | | |
| | | | | | "pruning": { | | |
| | | | | | "partitions_scanned": 1, | | |
| | | | | | "partitions_total": 1 | | |
| | | | | | } | | |
| | | | | | } | | |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 4 | [ 2 ] | JoinFilter | { | { | { |
| | | | | | "input_rows": 64, | "overall_percentage": 0.000000000000000e+00 | "join_id": "2" |
| | | | | | "output_rows": 64 | } | } |
| | | | | | } | | |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 5 | [ 4 ] | TableScan | { | { | { |
| | | | | | "io": { | "overall_percentage": 0.000000000000000e+00 | "columns": [ |
| | | | | | "bytes_scanned": 1024, | } | "I" |
| | | | | | "percentage_scanned_from_cache": 1, | | ], |
| | | | | | "scan_progress": 1 | | "table_name": "MY_DB.MY_SCHEMA.X1" |
| | | | | | }, | | } |
| | | | | | "output_rows": 64, | | |
| | | | | | "pruning": { | | |
| | | | | | "partitions_scanned": 1, | | |
| | | | | | "partitions_total": 1 | | |
| | | | | | } | | |
| | | | | | } | | |
+--------------------------------------+---------+-------------+--------------------+---------------+-----------------------------------------+-----------------------------------------------+----------------------------------------------------------------------+
Identifying “exploding” join operators¶
The following example shows how to use GET_QUERY_OPERATOR_STATS to examine a complicated query. This example looks for operators within a query that produce many more rows than were input to that operator.
This is the query to be analyzed:
select *
from t1
join t2 on t1.a = t2.a
join t3 on t1.b = t3.b
join t4 on t1.c = t4.c
;
Get the query ID of the previous query:
set lid = last_query_id();
The following query shows the ratio of output rows to input rows for each of the join operators in the query.
select
operator_id,
operator_attributes,
operator_statistics:output_rows / operator_statistics:input_rows as row_multiple
from table(get_query_operator_stats($lid))
where operator_type = 'Join'
order by step_id, operator_id;
+---------+-------------+--------------------------------------------------------------------------+---------------+
| STEP_ID | OPERATOR_ID | OPERATOR_ATTRIBUTES | ROW_MULTIPLE |
+---------+-------------+--------------------------------------------------------------------------+---------------+
| 1 | 1 | { "equality_join_condition": "(T4.C = T1.C)", "join_type": "INNER" } | 49.969249692 |
| 1 | 3 | { "equality_join_condition": "(T3.B = T1.B)", "join_type": "INNER" } | 116.071428571 |
| 1 | 5 | { "equality_join_condition": "(T2.A = T1.A)", "join_type": "INNER" } | 12.20657277 |
+---------+-------------+--------------------------------------------------------------------------+---------------+
After you identify the exploding joins, you can review each join condition to verify that the condition is correct.