Categories:

Query Syntax

# CONNECT BY¶

Joins a table to itself to process hierarchical data in the table. The CONNECT BY subclause of the FROM clause iterates to process the data.

For example, you can create a query that shows a “parts explosion” to recursively lists a component and the sub-components of that component.

The Snowflake syntax for CONNECT BY is mostly compatible with the Oracle syntax.

WITH

## Syntax¶

The general form of a statement with CONNECT BY is similar to the following (some variations in order are allowed, but are not shown):

SELECT <column_list> [ , <level_expression> ]
FROM <data_source>
CONNECT BY [ PRIOR ] <col1_identifier> = [ PRIOR ] <col2_identifier>
[ , [ PRIOR ] <col3_identifier> = [ PRIOR ] <col4_identifier> ]
...
...

column_list

This generally follows the rules for the projection clause of a SELECT statement.

level_expression

CONNECT BY queries allow some pseudo-columns. One of those pseudo-columns is LEVEL, which indicates the current level of the hierarchy (where level 1 represents the top of the hierarchy). The projection clause of the query can use LEVEL as a column.

data_source

The data source is usually a table, but can be another table-like data source, such as a view, UDTF, etc.

predicate

The predicate is an expression that selects the first “level” of the hierarchy (e.g. the president of the company or the top-level component in a parts explosion). The predicate should look similar to a WHERE clause, but without the keyword WHERE.

See the Examples section (in this topic) for predicate examples.

colN_identifier

The CONNECT BY clause should contain one or more expressions similar to those used in joins. Specifically, a column in the “current” level of the table should refer to a column in the “prior” (higher) level of the table.

For example, in a manager/employee hierarchy, the clause might look similar to:

... CONNECT BY manager_ID = PRIOR employee_ID ...


The keyword PRIOR indicates that the value should be taken from the prior (higher/parent) level.

In this example, the current employee’s manager_ID should match the prior level’s employee_ID.

The CONNECT BY clause can contain more than one such expression, for example:

... CONNECT BY y = PRIOR x AND b = PRIOR a ...


Each expression similar to the following should have exactly one occurrence of the keyword PRIOR:

CONNECT BY <col_1_identifier> = <col_2_identifier>


For example:

CONNECT BY <col_1_identifier> = PRIOR <col_2_identifier>


or

CONNECT BY PRIOR <col_1_identifier> = <col_2_identifier>


## Usage Notes¶

• A CONNECT BY clause always joins a table to itself, not to another table.

• By default, to prevent queries from infinite looping, there is an upper limit on the number of iterations.

• Some variations within the projection clause are valid. Although the syntax shows level_expression occurring after the column_list, the level expression(s) can occur in any order.

• The keyword PRIOR should occur exactly once in each CONNECT BY expression. PRIOR can occur on either the left-hand side or the right-hand side of the expression, but not on both.

• The Snowflake implementation of CONNECT BY is mostly compatible with the Oracle implementation; however, Snowflake does not support:

• NOCYCLE

• CONNECT_BY_ISCYCLE

• CONNECT_BY_ISLEAF

• Snowflake supports the function SYS_CONNECT_BY_PATH when used with the CONNECT BY clause. SYS_CONNECT_BY_PATH returns a string that contains the path from the root to the current element. An example is included in the Examples section below.

• Snowflake supports the CONNECT_BY_ROOT operator when used with the CONNECT BY clause. The CONNECT_BY_ROOT operator allows the current level to use information from the root level of the hierarchy, even if the root level is not the immediate parent of the current level. An example is included in the Examples section below.

• Although in theory the CONNECT BY clause can iterate as many times as necessary to process the data, Snowflake currently limits the number of iterations to 100 to prevent the query from running indefinitely if the query is constructed improperly.

## Examples¶

This example uses a CONNECT BY to show the management hierarchy in a table of employee information. The table and data are shown below:

CREATE OR REPLACE TABLE employees (title VARCHAR, employee_ID INTEGER, manager_ID INTEGER);

INSERT INTO employees (title, employee_ID, manager_ID) VALUES
('President', 1, NULL),  -- The President has no manager.
('Vice President Engineering', 10, 1),
('Programmer', 100, 10),
('QA Engineer', 101, 10),
('Vice President HR', 20, 1),
('Health Insurance Analyst', 200, 20);


The query and output are shown below:

SELECT employee_ID, manager_ID, title
FROM employees
CONNECT BY
manager_ID = PRIOR employee_id
ORDER BY employee_ID;
+-------------+------------+----------------------------+
| EMPLOYEE_ID | MANAGER_ID | TITLE                      |
|-------------+------------+----------------------------|
|           1 |       NULL | President                  |
|          10 |          1 | Vice President Engineering |
|          20 |          1 | Vice President HR          |
|         100 |         10 | Programmer                 |
|         101 |         10 | QA Engineer                |
|         200 |         20 | Health Insurance Analyst   |
+-------------+------------+----------------------------+


This example uses the SYS_CONNECT_BY_PATH function to show the hierarchy from the President down to the current employee:

SELECT SYS_CONNECT_BY_PATH(title, ' -> '), employee_ID, manager_ID, title
FROM employees
CONNECT BY
manager_ID = PRIOR employee_id
ORDER BY employee_ID;
+----------------------------------------------------------------+-------------+------------+----------------------------+
| SYS_CONNECT_BY_PATH(TITLE, ' -> ')                             | EMPLOYEE_ID | MANAGER_ID | TITLE                      |
|----------------------------------------------------------------+-------------+------------+----------------------------|
|  -> President                                                  |           1 |       NULL | President                  |
|  -> President -> Vice President Engineering                    |          10 |          1 | Vice President Engineering |
|  -> President -> Vice President HR                             |          20 |          1 | Vice President HR          |
|  -> President -> Vice President Engineering -> Programmer      |         100 |         10 | Programmer                 |
|  -> President -> Vice President Engineering -> QA Engineer     |         101 |         10 | QA Engineer                |
|  -> President -> Vice President HR -> Health Insurance Analyst |         200 |         20 | Health Insurance Analyst   |
+----------------------------------------------------------------+-------------+------------+----------------------------+


This example uses the CONNECT_BY_ROOT keyword to display information from the top of the hierarchy in each row of output:

SELECT
employee_ID, manager_ID, title,
CONNECT_BY_ROOT title AS root_title
FROM employees
CONNECT BY
manager_ID = PRIOR employee_id
ORDER BY employee_ID;
+-------------+------------+----------------------------+------------+
| EMPLOYEE_ID | MANAGER_ID | TITLE                      | ROOT_TITLE |
|-------------+------------+----------------------------+------------|
|           1 |       NULL | President                  | President  |
|          10 |          1 | Vice President Engineering | President  |
|          20 |          1 | Vice President HR          | President  |
|         100 |         10 | Programmer                 | President  |
|         101 |         10 | QA Engineer                | President  |
|         200 |         20 | Health Insurance Analyst   | President  |
+-------------+------------+----------------------------+------------+


This example uses a CONNECT BY to show a “parts explosion”:

Here is the data:

-- The components of a car.
CREATE TABLE components (
description VARCHAR,
quantity INTEGER,
component_ID INTEGER,
parent_component_ID INTEGER
);

INSERT INTO components (description, quantity, component_ID, parent_component_ID) VALUES
('car', 1, 1, 0),
('wheel', 4, 11, 1),
('tire', 1, 111, 11),
('#112 bolt', 5, 112, 11),
('brake', 1, 113, 11),
('engine', 1, 12, 1),
('piston', 4, 121, 12),
('cylinder block', 1, 122, 12),
('#112 bolt', 16, 112, 12)   -- Can use same type of bolt in multiple places
;


Here are the query and output:

SELECT
description,
quantity,
component_id,
parent_component_ID,
SYS_CONNECT_BY_PATH(component_ID, ' -> ') AS path
FROM components
CONNECT BY
parent_component_ID = PRIOR component_ID
ORDER BY path
;
+----------------+----------+--------------+---------------------+----------------------------+
| DESCRIPTION    | QUANTITY | COMPONENT_ID | PARENT_COMPONENT_ID | PATH                       |
|----------------+----------+--------------+---------------------+----------------------------|
| car            |        1 |            1 |                   0 |  -> 1                      |
| wheel          |        4 |           11 |                   1 |  -> 1 -> 11                |
| tire           |        1 |          111 |                  11 |  -> 1 -> 11 -> 111         |
| #112 bolt      |        5 |          112 |                  11 |  -> 1 -> 11 -> 112         |
| brake          |        1 |          113 |                  11 |  -> 1 -> 11 -> 113         |
| brake pad      |        1 |         1131 |                 113 |  -> 1 -> 11 -> 113 -> 1131 |
| engine         |        1 |           12 |                   1 |  -> 1 -> 12                |
| #112 bolt      |       16 |          112 |                  12 |  -> 1 -> 12 -> 112         |
| piston         |        4 |          121 |                  12 |  -> 1 -> 12 -> 121         |
| cylinder block |        1 |          122 |                  12 |  -> 1 -> 12 -> 122         |
+----------------+----------+--------------+---------------------+----------------------------+