Identifying queries that can benefit from search optimization¶
Search optimization can improve the performance of many queries. This topic describes characteristics of the kinds of queries that search optimization helps the most with, and conversely, the kinds of queries that do not benefit.
General query characteristics¶
Search optimization works best to improve the performance of queries with the following characteristics:
The query involves a column or columns other than the primary cluster key.
The query typically runs for a few seconds or longer (before applying search optimization). In most cases, search optimization will not substantially improve the performance of a query that has a sub-second execution time.
At least one of the columns accessed by the query filter operation has on the order of 100,000 distinct values or more.
To determine the number of distinct values, you can use either of the following:
Use
APPROX_COUNT_DISTINCT
to get the approximate number of distinct values:SELECT APPROX_COUNT_DISTINCT(column1) FROM table1;
Use
COUNT(DISTINCT <col_name>)
to get the actual number of distinct values:SELECT COUNT(DISTINCT c1), COUNT(DISTINCT c2) FROM test_table;
Because you need only an approximation of the number of distinct values, consider using
APPROX_COUNT_DISTINCT
, which is generally faster and cheaper thanCOUNT(DISTINCT <col_name>)
.
Supported data types¶
The search optimization service currently supports the following data types:
Fixed-point numbers (for example, INTEGER, NUMERIC)
DATE, TIME, and TIMESTAMP
VARCHAR
BINARY
VARIANT, OBJECT, and ARRAY
GEOGRAPHY
Queries that involve other types of values (i.e., FLOAT, GEOMETRY) do not benefit.
Supported predicate types¶
Search optimization can improve the performance of queries using these kinds of predicates:
Other potential improvements¶
Search optimization can also improve the performance of views and of queries that use JOIN.
Views¶
The search optimization service can indirectly improve the performance of views (including secure views). If a base table for a view has search optimization enabled, and the query uses a selective predicate for that table, the search optimization service can improve performance when filtering rows. See Supported predicate types.
Not all tables in the view need to have search optimization enabled. Search optimization is performed on each table independently.
Joins¶
The search optimization service does not directly improve the performance of joins. However, it can improve the performance of filtering rows from either table prior to the join. This improvement may occur when the table has search optimization enabled and the predicate is selective, as discussed under Supported predicate types.
Both tables do not need to have search optimization enabled. The decision to use search optimization is made for each table independently.
Queries that do not benefit from search optimization¶
Currently, the search optimization service does not support floating point data types, GEOMETRY, or other data types not already discussed. Snowflake might add support for more data types in the future.
Additionally, the search optimization service does not support the following:
External tables.
Materialized views.
Columns defined with a COLLATE clause.
Column concatenation.
Analytical expressions.
Casts on table columns (except for fixed-point numbers cast to strings).
Although search optimization supports predicates with implicit and explicit casts on constant values, it does not support predicates that cast values in the actual table column (except for casts from INTEGER and NUMBER to VARCHAR).
For example, the following predicates are supported because they use implicit and explicit casts on constant values (not values in the table column):
-- Supported predicate -- (where the string '2020-01-01' is implicitly cast to a date) WHERE timestamp1 = '2020-01-01'; -- Supported predicate -- (where the string '2020-01-01' is explicitly cast to a date) WHERE timestamp1 = '2020-01-01'::date;
The following predicate is not supported because it uses a cast on values in the table column:
-- Unsupported predicate -- (where values in a VARCHAR column are cast to DATE) WHERE to_date(varchar_column) = '2020-01-01';
The search optimization service considers the original column values, not the values after the cast. As a result, the search optimization service is not used for queries with these predicates.
As noted, the exception to this rule is casting NUMBER or INTEGER values to VARCHAR values in the table column. The search optimization service does support this type of predicate:
-- Supported predicate -- (where values in a numeric column are cast to a string) WHERE cast(numeric_column as varchar) = '2'
Search optimization does not improve performance of queries that use Time Travel because search optimization works only on active data.