- Categories:
CONNECT BY¶
Joins a table to itself to process hierarchical data in the table. The CONNECT BY
subclause of the
FROM clause iterates to process the data.
For example, you can create a query that shows a “parts explosion” to recursively lists a component and the sub-components of that component.
The Snowflake syntax for CONNECT BY is mostly compatible with the Oracle syntax.
- See also:
Syntax¶
The general form of a statement with CONNECT BY is similar to the following (some variations in order are allowed, but are not shown):
SELECT <column_list> [ , <level_expression> ]
FROM <data_source>
START WITH <predicate>
CONNECT BY [ PRIOR ] <col1_identifier> = [ PRIOR ] <col2_identifier>
[ , [ PRIOR ] <col3_identifier> = [ PRIOR ] <col4_identifier> ]
...
...
column_list
This generally follows the rules for the projection clause of a SELECT statement.
level_expression
CONNECT BY queries allow some pseudo-columns. One of those pseudo-columns is
LEVEL
, which indicates the current level of the hierarchy (where level 1 represents the top of the hierarchy). The projection clause of the query can use LEVEL as a column.data_source
The data source is usually a table, but can be another table-like data source, such as a view, UDTF, etc.
predicate
The predicate is an expression that selects the first “level” of the hierarchy (e.g. the president of the company or the top-level component in a parts explosion). The predicate should look similar to a WHERE clause, but without the keyword
WHERE
.See the Examples section (in this topic) for predicate examples.
colN_identifier
The CONNECT BY clause should contain one or more expressions similar to those used in joins. Specifically, a column in the “current” level of the table should refer to a column in the “prior” (higher) level of the table.
For example, in a manager/employee hierarchy, the clause might look similar to:
... CONNECT BY manager_ID = PRIOR employee_ID ...
The keyword PRIOR indicates that the value should be taken from the prior (higher/parent) level.
In this example, the current employee’s manager_ID should match the prior level’s employee_ID.
The CONNECT BY clause can contain more than one such expression, for example:
... CONNECT BY y = PRIOR x AND b = PRIOR a ...
Each expression similar to the following should have exactly one occurrence of the keyword PRIOR:
CONNECT BY <col_1_identifier> = <col_2_identifier>
For example:
CONNECT BY <col_1_identifier> = PRIOR <col_2_identifier>
or
CONNECT BY PRIOR <col_1_identifier> = <col_2_identifier>
Usage Notes¶
A CONNECT BY clause always joins a table to itself, not to another table.
By default, to prevent queries from infinite looping, there is an upper limit on the number of iterations.
Some variations within the projection clause are valid. Although the syntax shows
level_expression
occurring after thecolumn_list
, the level expression(s) can occur in any order.The keyword
PRIOR
should occur exactly once in each CONNECT BY expression.PRIOR
can occur on either the left-hand side or the right-hand side of the expression, but not on both.The Snowflake implementation of CONNECT BY is mostly compatible with the Oracle implementation; however, Snowflake does not support:
NOCYCLE
CONNECT_BY_ISCYCLE
CONNECT_BY_ISLEAF
Snowflake supports the function
SYS_CONNECT_BY_PATH
when used with theCONNECT BY
clause.SYS_CONNECT_BY_PATH
returns a string that contains the path from the root to the current element. An example is included in the Examples section below.
Snowflake supports the
CONNECT_BY_ROOT
operator when used with theCONNECT BY
clause. TheCONNECT_BY_ROOT
operator allows the current level to use information from the root level of the hierarchy, even if the root level is not the immediate parent of the current level. An example is included in the Examples section below.Although in theory the
CONNECT BY
clause can iterate as many times as necessary to process the data, Snowflake currently limits the number of iterations to 100 to prevent the query from running indefinitely if the query is constructed improperly.
Examples¶
This example uses a CONNECT BY to show the management hierarchy in a table of employee information. The table and data are shown below:
CREATE OR REPLACE TABLE employees (title VARCHAR, employee_ID INTEGER, manager_ID INTEGER);INSERT INTO employees (title, employee_ID, manager_ID) VALUES ('President', 1, NULL), -- The President has no manager. ('Vice President Engineering', 10, 1), ('Programmer', 100, 10), ('QA Engineer', 101, 10), ('Vice President HR', 20, 1), ('Health Insurance Analyst', 200, 20);The query and output are shown below:
SELECT employee_ID, manager_ID, title FROM employees START WITH title = 'President' CONNECT BY manager_ID = PRIOR employee_id ORDER BY employee_ID; +-------------+------------+----------------------------+ | EMPLOYEE_ID | MANAGER_ID | TITLE | |-------------+------------+----------------------------| | 1 | NULL | President | | 10 | 1 | Vice President Engineering | | 20 | 1 | Vice President HR | | 100 | 10 | Programmer | | 101 | 10 | QA Engineer | | 200 | 20 | Health Insurance Analyst | +-------------+------------+----------------------------+
This example uses the SYS_CONNECT_BY_PATH
function to show the hierarchy from the President down to the
current employee:
SELECT SYS_CONNECT_BY_PATH(title, ' -> '), employee_ID, manager_ID, title FROM employees START WITH title = 'President' CONNECT BY manager_ID = PRIOR employee_id ORDER BY employee_ID; +----------------------------------------------------------------+-------------+------------+----------------------------+ | SYS_CONNECT_BY_PATH(TITLE, ' -> ') | EMPLOYEE_ID | MANAGER_ID | TITLE | |----------------------------------------------------------------+-------------+------------+----------------------------| | -> President | 1 | NULL | President | | -> President -> Vice President Engineering | 10 | 1 | Vice President Engineering | | -> President -> Vice President HR | 20 | 1 | Vice President HR | | -> President -> Vice President Engineering -> Programmer | 100 | 10 | Programmer | | -> President -> Vice President Engineering -> QA Engineer | 101 | 10 | QA Engineer | | -> President -> Vice President HR -> Health Insurance Analyst | 200 | 20 | Health Insurance Analyst | +----------------------------------------------------------------+-------------+------------+----------------------------+
This example uses the CONNECT_BY_ROOT
keyword to display information from the top of the hierarchy in each row
of output:
SELECT employee_ID, manager_ID, title, CONNECT_BY_ROOT title AS root_title FROM employees START WITH title = 'President' CONNECT BY manager_ID = PRIOR employee_id ORDER BY employee_ID; +-------------+------------+----------------------------+------------+ | EMPLOYEE_ID | MANAGER_ID | TITLE | ROOT_TITLE | |-------------+------------+----------------------------+------------| | 1 | NULL | President | President | | 10 | 1 | Vice President Engineering | President | | 20 | 1 | Vice President HR | President | | 100 | 10 | Programmer | President | | 101 | 10 | QA Engineer | President | | 200 | 20 | Health Insurance Analyst | President | +-------------+------------+----------------------------+------------+
This example uses a CONNECT BY to show a “parts explosion”:
Here is the data:
-- The components of a car. CREATE TABLE components ( description VARCHAR, quantity INTEGER, component_ID INTEGER, parent_component_ID INTEGER ); INSERT INTO components (description, quantity, component_ID, parent_component_ID) VALUES ('car', 1, 1, 0), ('wheel', 4, 11, 1), ('tire', 1, 111, 11), ('#112 bolt', 5, 112, 11), ('brake', 1, 113, 11), ('brake pad', 1, 1131, 113), ('engine', 1, 12, 1), ('piston', 4, 121, 12), ('cylinder block', 1, 122, 12), ('#112 bolt', 16, 112, 12) -- Can use same type of bolt in multiple places ;Here are the query and output:
SELECT description, quantity, component_id, parent_component_ID, SYS_CONNECT_BY_PATH(component_ID, ' -> ') AS path FROM components START WITH component_ID = 1 CONNECT BY parent_component_ID = PRIOR component_ID ORDER BY path ; +----------------+----------+--------------+---------------------+----------------------------+ | DESCRIPTION | QUANTITY | COMPONENT_ID | PARENT_COMPONENT_ID | PATH | |----------------+----------+--------------+---------------------+----------------------------| | car | 1 | 1 | 0 | -> 1 | | wheel | 4 | 11 | 1 | -> 1 -> 11 | | tire | 1 | 111 | 11 | -> 1 -> 11 -> 111 | | #112 bolt | 5 | 112 | 11 | -> 1 -> 11 -> 112 | | brake | 1 | 113 | 11 | -> 1 -> 11 -> 113 | | brake pad | 1 | 1131 | 113 | -> 1 -> 11 -> 113 -> 1131 | | engine | 1 | 12 | 1 | -> 1 -> 12 | | #112 bolt | 16 | 112 | 12 | -> 1 -> 12 -> 112 | | piston | 4 | 121 | 12 | -> 1 -> 12 -> 121 | | cylinder block | 1 | 122 | 12 | -> 1 -> 12 -> 122 | +----------------+----------+--------------+---------------------+----------------------------+