Understanding How Snowflake Can Eliminate Redundant Joins¶
In some cases, a join on a key column can refer to tables that are not needed for the join. If your tables have key columns and you are using and enforcing the UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints, Snowflake can improve query performance by eliminating unnecessary joins on key columns.
These optimizations are performed only if you use the RELY constraint property to indicate that the data in your tables complies with the constraints around primary keys and foreign keys.
Setting the RELY Constraint Property to Eliminate Unnecessary Joins¶
Snowflake only performs this optimization on joins if you indicate that the data in your tables comply with the UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints.
As mentioned in Supported Constraint Types, Snowflake does not enforce UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints on standard tables, but does enforce them on hybrid tables. For standard tables, you are responsible for enforcing constraints on the data.
If you have ensured that the data complies with these constraints and you want Snowflake to eliminate unnecessary joins, set the RELY constraint property on the UNIQUE, PRIMARY KEY, FOREIGN KEY constraints.
Note
You are responsible for maintaining the integrity of your constraints (UNIQUE, PRIMARY KEY, and FOREIGN KEY). If the integrity of your constraints is not maintained, the query results might differ if the RELY constraint property is set (compared to the results with NORELY).
Examples of Eliminating Unnecessary Joins¶
The following examples demonstrates cases in which Snowflake eliminates joins and references to tables that are not necessary:
In these examples:
dim_products
is a table that contains a row for each product available for purchase.In this table,
product_id
is a column that uniquely identifies a product.fact_sales
is a table that contains a row for each sale of a product.In this table,
product_id
is a column that identifies the product that was sold. The IDs in this column correspond to the IDs in theproduct_id
column of thedim_products
table.
Example 1: Eliminating an Unnecessary Left Outer Join¶
This following is an example of an unnecessary left outer join that Snowflake can optimize:
SELECT f.*
FROM fact_sales f
LEFT OUTER JOIN dim_products p
ON f.product_id = p.product_id;
The join is unnecessary because the statement does not refer to any columns in the dim_products
table on the right (other than
the primary key column for the join).
If the dim_products.product_id
column has the UNIQUE or PRIMARY KEY constraint with the RELY property, Snowflake can identify
this join as unnecessary and can eliminate the reference to the dim_products
table on the right.
Example 2: Eliminating an Unnecessary Self-Join¶
This following is an example of an unnecessary self-join that Snowflake can optimize:
SELECT p1.product_id, p2.product_name
FROM dim_products p1, dim_products p2
WHERE p1.product_id = p2.product_id;
The statement unnecessarily joins the dim_products
table with itself and selects columns from that table.
If the dim_products.product_id
column has the UNIQUE or PRIMARY KEY constraint with the RELY property, Snowflake can identify
this join as unnecessary and can eliminate the reference to the dim_products
table on the right.
Example 3: Eliminating an Unnecessary Join on a Primary Key and Foreign Key¶
This following is an example of an unnecessary inner join that Snowflake can optimize.
SELECT p.product_id, f.units_sold
FROM fact_sales f, dim_products p
WHERE f.product_id = p.product_id;
The statement does not refer to any columns in the dim_products
table on the right, other than the primary key column for the
join.
If the dim_products.product_id
column has the PRIMARY KEY constraint and the fact_sales.product_id
column has the FOREIGN
KEY constraint, Snowflake can identify this join as unnecessary and can eliminate the reference to the dim_products
table on
the right.