- Categories:
JOIN¶
A JOIN
operation combines rows from two tables (or other table-like sources, such as
views or table functions) to create a new combined row that can be used in the query.
For a conceptual explanation of joins, see Working with Joins.
This topic describes how to use the JOIN
construct in the FROM clause.
The JOIN
subclause specifies (explicitly or implicitly) how to relate rows
in one table to the corresponding rows in the other table. See also ASOF JOIN,
which is used to join time-series data on timestamp columns when their values closely follow each other,
precede each other, or match exactly.
Although the recommended way to join tables is to use JOIN
with the ON
subclause of the FROM
clause,
an alternative way to join tables is to use the WHERE
clause. For details, see the documentation for the
WHERE clause.
Syntax¶
Use one of the following:
SELECT ...
FROM <object_ref1> [
{
INNER
| { LEFT | RIGHT | FULL } [ OUTER ]
}
]
JOIN <object_ref2>
[ ON <condition> ]
[ ... ]
SELECT *
FROM <object_ref1> [
{
INNER
| { LEFT | RIGHT | FULL } [ OUTER ]
}
]
JOIN <object_ref2>
[ USING( <column_list> ) ]
[ ... ]
SELECT ...
FROM <object_ref1> [
{
| NATURAL [ { LEFT | RIGHT | FULL } [ OUTER ] ]
| CROSS
}
]
JOIN <object_ref2>
[ ... ]
Parameters¶
object_ref1
andobject_ref2
Each object reference is a table or table-like data source.
JOIN
Use the
JOIN
keyword to specify that the tables should be joined. CombineJOIN
with other join-related keywords (e.g.INNER
orOUTER
) to specify the type of join.The semantics of joins are as follows (for brevity, this topic uses
o1
ando2
forobject_ref1
andobject_ref2
, respectively).Join Type
Semantics
o1 INNER JOIN o2
For each row of
o1
, a row is produced for each row ofo2
that matches according to theON condition
subclause. (Note that you can also use a comma to specify an inner join. For an example, see the examples section below.) If you useINNER JOIN
without theON
clause (or if you use comma without aWHERE
clause), the result is the same as usingCROSS JOIN
: a Cartesian product (every row ofo1
paired with every row ofo2
).o1 LEFT OUTER JOIN o2
The result of the inner join is augmented with a row for each row of
o1
that has no matches ino2
. The result columns referencingo2
contain null.o1 RIGHT OUTER JOIN o2
The result of the inner join is augmented with a row for each row of
o2
that has no matches ino1
. The result columns referencingo1
contain null.o1 FULL OUTER JOIN o2
Returns all joined rows, plus one row for each unmatched left side row (extended with nulls on the right), plus one row for each unmatched right side row (extended with nulls on the left).
o1 CROSS JOIN o2
For every possible combination of rows from
o1
ando2
(i.e. Cartesian product), the joined table contains a row consisting of all columns ino1
followed by all columns ino2
. ACROSS JOIN
cannot be combined with anON condition
clause. However, you can use aWHERE
clause to filter the results.o1 NATURAL JOIN o2
A
NATURAL JOIN
is identical to an explicitJOIN
on the common columns of the two tables, except that the common columns are included only once in the output. (A natural join assumes that columns with the same name, but in different tables, contain corresponding data.) See the Examples section below for some examples. ANATURAL JOIN
can be combined with anOUTER JOIN
. ANATURAL JOIN
cannot be combined with anON condition
clause because theJOIN
condition is already implied. However, you can use aWHERE
clause to filter the results.See also:
Default:
INNER JOIN
If the word
JOIN
is used without specifyingINNER
orOUTER
, then theJOIN
is an inner join.ON condition
A boolean expression that defines the rows from the two sides of the
JOIN
that are considered to match, for example:ON object_ref2.id_number = object_ref1.id_number
Conditions are discussed in more detail in the WHERE clause documentation.
The
ON
clause is prohibited forCROSS JOIN
.The
ON
clause is unnecessary (and prohibited) forNATURAL JOIN
; the join columns are implied.For other joins, the
ON
clause is optional. However, omitting theON
clause results in a Cartesian product (every row ofobject_ref1
paired with every row ofobject_ref2
). A Cartesian product can produce a very large volume of output, almost all of which consists of pairs of rows that aren’t actually related; this consumes a lot of resources and is often a user error.USING( <column_list> )
A list of columns in common between the two tables being joined; these columns are used as the join columns. The columns must have the same name and meaning in each of the tables being joined.
For example, suppose that the SQL statement contains:
... o1 JOIN o2 USING (key_column)
In the simple case, this would be equivalent to:
... o1 JOIN o2 ON o2.key_column = o1.key_column
In the standard JOIN syntax, the projection list (the list of columns and other expressions after the SELECT keyword) is “*”. This causes the query to return the
key_column
exactly once. The columns are returned in the following order:The columns in the USING clause in the order specified.
The left table columns not specified in the USING clause.
The right table columns not specified in the USING clause.
For examples of standard and non-standard usage, see the examples below.
Usage notes¶
The following restrictions apply to table functions other than SQL UDTFs:
You cannot specify the ON, USING, or NATURAL JOIN clause in a lateral table function (other than a SQL UDTF).
For example, the following syntax is not allowed:
SELECT ... FROM my_table JOIN TABLE(FLATTEN(input=>[col_a])) ON ... ;
SELECT ... FROM my_table INNER JOIN TABLE(FLATTEN(input=>[col_a])) ON ... ;
SELECT ... FROM my_table JOIN TABLE(my_js_udtf(col_a)) ON ... ;
SELECT ... FROM my_table INNER JOIN TABLE(my_js_udtf(col_a)) ON ... ;
You cannot specify the ON, USING, or NATURAL JOIN clause in an outer lateral join to a table function (other than a SQL UDTF).
For example, the following syntax is not allowed:
SELECT ... FROM my_table LEFT JOIN TABLE(FLATTEN(input=>[a])) ON ... ;
SELECT ... FROM my_table FULL JOIN TABLE(FLATTEN(input=>[a])) ON ... ;
SELECT ... FROM my_table LEFT JOIN TABLE(my_js_udtf(a)) ON ... ;
SELECT ... FROM my_table FULL JOIN TABLE(my_js_udtf(a)) ON ... ;
Using this syntax above results in the following error:
000002 (0A000): Unsupported feature 'lateral table function called with OUTER JOIN syntax or a join predicate (ON clause)'
These restrictions do not apply if you are using a comma, rather than a JOIN keyword:
SELECT ... FROM my_table, TABLE(FLATTEN(input=>[col_a])) ON ... ;
Examples¶
Many of the JOIN
examples use two tables, t1
and t2
. The tables and their data are created as shown below:
CREATE TABLE t1 (col1 INTEGER); CREATE TABLE t2 (col1 INTEGER);INSERT INTO t1 (col1) VALUES (2), (3), (4); INSERT INTO t2 (col1) VALUES (1), (2), (2), (3);
Inner join:
SELECT t1.col1, t2.col1 FROM t1 INNER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2; +------+------+ | COL1 | COL1 | |------+------| | 2 | 2 | | 2 | 2 | | 3 | 3 | +------+------+
This shows a left outer join. Note the NULL value for the row in table t1 that doesn’t have a matching row in table t2.
SELECT t1.col1, t2.col1 FROM t1 LEFT OUTER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2; +------+------+ | COL1 | COL1 | |------+------| | 2 | 2 | | 2 | 2 | | 3 | 3 | | 4 | NULL | +------+------+
This shows a right outer join. Note the NULL value for the row in table t1 that doesn’t have a matching row in table t2.
SELECT t1.col1, t2.col1 FROM t1 RIGHT OUTER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2; +------+------+ | COL1 | COL1 | |------+------| | 2 | 2 | | 2 | 2 | | 3 | 3 | | NULL | 1 | +------+------+
This shows a full outer join. Note that because each table has a row that doesn’t have a matching row in the other table, the output contains two rows with NULL values:
SELECT t1.col1, t2.col1 FROM t1 FULL OUTER JOIN t2 ON t2.col1 = t1.col1 ORDER BY 1,2; +------+------+ | COL1 | COL1 | |------+------| | 2 | 2 | | 2 | 2 | | 3 | 3 | | 4 | NULL | | NULL | 1 | +------+------+
Here is an example of a cross join, which produces a Cartesian product. Note that the cross join does not have an ON clause.
SELECT t1.col1, t2.col1 FROM t1 CROSS JOIN t2 ORDER BY 1, 2; +------+------+ | COL1 | COL1 | |------+------| | 2 | 1 | | 2 | 2 | | 2 | 2 | | 2 | 3 | | 3 | 1 | | 3 | 2 | | 3 | 2 | | 3 | 3 | | 4 | 1 | | 4 | 2 | | 4 | 2 | | 4 | 3 | +------+------+
A cross join can be filtered by a WHERE
clause, as shown in the example
below:
SELECT t1.col1, t2.col1 FROM t1 CROSS JOIN t2 WHERE t2.col1 = t1.col1 ORDER BY 1, 2; +------+------+ | COL1 | COL1 | |------+------| | 2 | 2 | | 2 | 2 | | 3 | 3 | +------+------+
This is an example of a natural join. This produces the same output as the corresponding inner join, except that the output doesn’t include a second copy of the join column:
CREATE OR REPLACE TABLE d1 ( id number, name string ); +--------------------------------+ | status | |--------------------------------| | Table D1 successfully created. | +--------------------------------+ INSERT INTO d1 (id, name) VALUES (1,'a'), (2,'b'), (4,'c'); +-------------------------+ | number of rows inserted | |-------------------------| | 3 | +-------------------------+ CREATE OR REPLACE TABLE d2 ( id number, value string ); +--------------------------------+ | status | |--------------------------------| | Table D2 successfully created. | +--------------------------------+ INSERT INTO d2 (id, value) VALUES (1,'xx'), (2,'yy'), (5,'zz'); +-------------------------+ | number of rows inserted | |-------------------------| | 3 | +-------------------------+ SELECT * FROM d1 NATURAL INNER JOIN d2 ORDER BY id; +----+------+-------+ | ID | NAME | VALUE | |----+------+-------| | 1 | a | xx | | 2 | b | yy | +----+------+-------+
Natural joins can be combined with outer joins, for example:
SELECT * FROM d1 NATURAL FULL OUTER JOIN d2 ORDER BY ID; +----+------+-------+ | ID | NAME | VALUE | |----+------+-------| | 1 | a | xx | | 2 | b | yy | | 4 | c | NULL | | 5 | NULL | zz | +----+------+-------+
Joins can be combined in the FROM
clause. The following code creates a third table, then chains together two JOINs in
the FROM clause:
CREATE TABLE t3 (col1 INTEGER); INSERT INTO t3 (col1) VALUES (2), (6);SELECT t1.*, t2.*, t3.* FROM t1 LEFT OUTER JOIN t2 ON (t1.col1 = t2.col1) RIGHT OUTER JOIN t3 ON (t3.col1 = t2.col1) ORDER BY t1.col1; +------+------+------+ | COL1 | COL1 | COL1 | |------+------+------| | 2 | 2 | 2 | | 2 | 2 | 2 | | NULL | NULL | 6 | +------+------+------+
In such a query, the results are determined based on the joins taking place from left to right (though the optimizer might reorder the joins if a different join order will produce the same result). If the right outer join is meant to take place before the left outer join, then the query can be written as follows:
SELECT t1.*, t2.*, t3.* FROM t1 LEFT OUTER JOIN (t2 RIGHT OUTER JOIN t3 ON (t3.col1 = t2.col1)) ON (t1.col1 = t2.col1) ORDER BY t1.col1; +------+------+------+ | COL1 | COL1 | COL1 | |------+------+------| | 2 | 2 | 2 | | 2 | 2 | 2 | | 3 | NULL | NULL | | 4 | NULL | NULL | +------+------+------+
The two examples below show standard (ISO 9075) and non-standard usage of
the USING
clause. Both are supported by Snowflake.
This first example shows standard usage. Specifically, the projection list contains exactly “*”. Even though the example query joins two tables, and each table has one column, and the query asks for all columns, the output contains one column, not two.
WITH
l AS (
SELECT 'a' AS userid
),
r AS (
SELECT 'b' AS userid
)
SELECT *
FROM l LEFT JOIN r USING(userid)
;
+--------+
| USERID |
|--------|
| a |
+--------+
The following example shows non-standard usage. The projection list contains something other than “*”. The output contains two columns, and the second column contains either a value from the second table or NULL.
WITH
l AS (
SELECT 'a' AS userid
),
r AS (
SELECT 'b' AS userid
)
SELECT l.userid as UI_L,
r.userid as UI_R
FROM l LEFT JOIN r USING(userid)
;
+------+------+
| UI_L | UI_R |
|------+------|
| a | NULL |
+------+------+