Speeding up join queries with search optimization

The search optimization service can improve the performance of join queries that have a small number of distinct values on the build side of the join.

For example, the search optimization service can improve the performance of these types of joins:

  • Suppose that products is a table containing a row for each product, and sales is a table containing a row for each sale of a product. The products table contains fewer rows and is smaller than the sales table. To find all sales of a specific product, you join the sales table (the larger table) with the products table (the smaller table). Because the products table is small, there are few distinct values on the build side of the join.

    Note

    In data warehousing, the large table is often referred to as the fact table. The small table is referred to as the dimension table. The rest of this topic uses these terms when referring to the large table and the small table in a join.

  • Suppose that customers is a table containing a row for each customer, and sales is a table containing a row for each sale. Both tables are large. To find all sales for a specific customer, you join the sales table (the probe side) with the customers table (the build side) and use a filter so that there are a small number of distinct values on the build side of the join.

Enabling the search optimization service to improve the performance of joins

To enable the search optimization service to improve the performance of joins, add search optimization to the table on the probe side of the join. This table is usually a large table that isn’t filtered in join queries, such as a fact table.

To take advantage of search optimization, make sure the build side of the join has a small number of distinct values, either because it’s a small dimension table or because of a selective filter. The search optimization costs of a query are proportionate to the number of distinct values that must be looked up on the build side of the join. If this number is too large, Snowflake might decide against using the search access path and use the regular table access path instead.

Supported join predicates

The search optimization service can improve the performance of queries with the following types of join predicates:

  • Equality predicates of the form probe_side_table.column = build_side_table.column.

  • Transformations on the build-side operand of the predicate (for example, string concatenation, addition, and so on).

  • Conjunctions (AND) of multiple equality predicates.

Examples of supported queries

This section shows examples of join queries that can benefit from search optimization.

Example: Simple equality predicate

The following is an example of a supported query that uses a simple equality predicate as the join predicate. This query joins a table named sales with a table named customers. The probe-side table sales is large and has search optimization enabled. The build-side table customers is also large, but the input from this table is small, due to the selective filter on the customer_id column.

SELECT sales.date, customer.name
  FROM sales JOIN customers ON (sales.customer_id = customers.customer_id)
  WHERE customers.customer_id = 2094;
Copy

Example: Predicate transformed on the dimension-side operand

The following query joins a fact table named sales with a dimension table named products. The fact table is large and has search optimization enabled. The dimension table is small.

This query transforms the dimension-side operand of the predicate (for example, by multiplying values in the join condition) and can benefit from search optimization:

SELECT sales.date, product.name
  FROM sales JOIN products ON (sales.product_id = product.old_id * 100)
  WHERE product.category = 'Cutlery';
Copy

Example: Predicate spanning multiple columns

Queries in which a join predicate spans multiple columns can benefit from search optimization:

SELECT sales.date, product.name
  FROM sales JOIN products ON (sales.product_id = product.id and sales.location = product.place_of_production)
  WHERE product.category = 'Cutlery';
Copy

Example: Query using point-lookup filters and join predicates

In a query that uses both regular point-lookup filters and join predicates, the search optimization service can improve the performance of both. In the following query, the search optimization service can improve the sales.location point-lookup predicate as well as the product_id join predicate:

SELECT sales.date, product.name
  FROM sales JOIN products ON (sales.product_id = product.id)
  WHERE product.category = 'Cutlery'
  AND sales.location = 'Buenos Aires';
Copy

Limitations

The following limitations apply to the search optimization service and join queries:

  • Disjuncts (OR) in join predicates currently aren’t supported.

  • LIKE, ILIKE, and RLIKE join predicates currently aren’t supported.

  • Join predicates on VARIANT columns currently aren’t supported.

  • EQUAL_NULL equality predicates currently aren’t supported.

  • The current limitations of the search optimization service also apply to join queries.