Schemas:

ACCOUNT_USAGE

AUTOMATIC_CLUSTERING_BENEFITS View

This Account Usage view can be used to determine the efficacy of pruning due to Automatic Clustering. You can use this view to compare the effects on pruning before and after enabling Automatic Clustering for a table.

To determine the efficacy of pruning due to Automatic Clustering, you can compare the number of partitions pruned due to Automatic Clustering (PARTITIONS_PRUNED_ADDITIONAL) against the total number of partitions pruned (PARTITIONS_PRUNED_DEFAULT + PARTITIONS_PRUNED_ADDITIONAL).

Columns

Column Name

Data Type

Description

START_TIME

TIMESTAMP_LTZ

Start of the time range (on the hour mark) during which the queries were executed.

END_TIME

TIMESTAMP_LTZ

End of the time range (on the hour mark) during which the queries were executed.

TABLE_ID

NUMBER

Internal/system-generated identifier for the table that was queried.

TABLE_NAME

TEXT

Name of the table that was queried.

SCHEMA_ID

NUMBER

Internal/system-generated identifier for the schema that contains the table that was queried.

SCHEMA_NAME

TEXT

Name of the schema that contains the table that was queried.

DATABASE_ID

NUMBER

Internal/system-generated identifier for the database that contains the table that was queried.

DATABASE_NAME

TEXT

Name of the database that contains the table that was queried.

NUM_SCANS

NUMBER

Number of scan operations (from all queries on the table during the START_TIME and END_TIME window) that leveraged clustering keys to prune on predicates. Note that a given query might result in multiple scan operations on the same table.

PARTITIONS_SCANNED

NUMBER

Number of partitions scanned during the scan operations described in NUM_SCANS.

PARTITIONS_PRUNED_DEFAULT

NUMBER

Estimated number of partitions that were pruned as a result of the default (natural) ordering of data during the scan operations described in NUM_SCANS.

PARTITIONS_PRUNED_ADDITIONAL

NUMBER

Estimated number of partitions that were pruned as a result of Automatic Clustering during the scan operations described in NUM_SCANS.

ROWS_SCANNED

NUMBER

Number of rows scanned during the scan operations described in NUM_SCANS.

ROWS_PRUNED_DEFAULT

NUMBER

Estimated number of rows that were pruned as a result of the default (natural) ordering of data during the scan operations described in NUM_SCANS.

ROWS_PRUNED_ADDITIONAL

NUMBER

Estimated number of rows that were pruned as a result of Automatic Clustering during the scan operations described in NUM_SCANS.

Usage Notes

  • Latency for the view may be up to 6 hours.

  • PARTITIONS_PRUNED_DEFAULT, PARTITIONS_PRUNED_ADDITIONAL, ROWS_PRUNED_DEFAULT, and ROWS_PRUNED_ADDITIONAL are estimated based on the partitions that have not been clustered.

    These amounts can change over time, due to:

    • Changes in the magnitude and effects of DML operations, which you can observe by using the TABLE_DML_HISTORY View.

    • The default (natural) ordering of data in any newly created partitions.

  • The sum of PARTITIONS_PRUNED_DEFAULT and PARTITIONS_PRUNED_ADDITIONAL is the actual number (not an estimated number) of partitions pruned during the scan operations described in NUM_SCANS.

    Similarly, the sum of ROWS_PRUNED_DEFAULT and ROWS_PRUNED_ADDITIONAL is the actual number (not an estimated number) of rows pruned during the scan operations described in NUM_SCANS.

Examples

List the tables that have benefited the most from Automatic Clustering within the last 24 hours:

SELECT * FROM snowflake.account_usage.automatic_clustering_benefits
WHERE start_time > DATEADD(hour, -24, CURRENT_TIMESTAMP())
ORDER BY partitions_pruned_additional / GREATEST(partitions_pruned_default + partitions_pruned_additional, 1) DESC,
partitions_pruned_additional DESC;
Copy

The example above uses GREATEST to avoid dividing by zero when the number of partitions pruned is zero.