- Categories:
GROUP BY¶
Groups rows with the same group-by-item expressions and computes aggregate functions for the resulting group. A GROUP BY expression can be a column name, a number referencing a position in the SELECT list, or a general expression.
- Extensions:
Syntax¶
SELECT ...
FROM ...
[ ... ]
GROUP BY groupItem [ , groupItem> [ , ... ] ]
[ ... ]
Where:
groupItem ::= { <column_alias> | <position> | <expr> }
Usage Notes¶
A GROUP BY clause can reference expressions in the projection clause by name or by position. If the GROUP BY clause references by name, then each reference is resolved as follows:
If the query contains a database object (e.g. table or view) with a matching column name, then the reference is resolved to the column name.
Otherwise, if the projection clause of the SELECT contains an expression alias with a matching name, then the reference is resolved to the alias.
For an example, see Demonstrate Precedence When a Column Name and an Alias Match.
Examples¶
Group By One Column¶
This example shows the gross revenue per product, grouped by product_id
:
-- Create the sales table and insert some rows. CREATE TABLE sales (product_ID INTEGER, retail_price REAL, quantity INTEGER, city VARCHAR, state VARCHAR); INSERT INTO sales (product_id, retail_price, quantity, city, state) VALUES (1, 2.00, 1, 'SF', 'CA'), (1, 2.00, 2, 'SJ', 'CA'), (2, 5.00, 4, 'SF', 'CA'), (2, 5.00, 8, 'SJ', 'CA'), (2, 5.00, 16, 'Miami', 'FL'), (2, 5.00, 32, 'Orlando', 'FL'), (2, 5.00, 64, 'SJ', 'PR'); -- Show the gross revenue, grouped by product_id. -- In other words, show the total amount of money received for -- selling each product. SELECT product_ID, SUM(retail_price * quantity) AS gross_revenue FROM sales GROUP BY product_ID;Output:
+------------+---------------+ | PRODUCT_ID | GROSS_REVENUE | +============+===============+ | 1 | 6 | +------------+---------------+ | 2 | 620 | +------------+---------------+
This example builds on the previous example, showing the net profit per product, grouped by product_id
:
-- Create the products table and insert some rows. CREATE TABLE products (product_ID INTEGER, wholesale_price REAL); INSERT INTO products (product_ID, wholesale_price) VALUES (1, 1.00); INSERT INTO products (product_ID, wholesale_price) VALUES (2, 2.00); -- Show the net profit on each product. SELECT p.product_ID, SUM((s.retail_price - p.wholesale_price) * s.quantity) AS profit FROM products AS p, sales AS s WHERE s.product_ID = p.product_ID GROUP BY p.product_ID;Output:
+------------+--------+ | PRODUCT_ID | PROFIT | +============+========+ | 1 | 3 | +------------+--------+ | 2 | 372 | +------------+--------+
Group By Multiple Columns¶
You can group by multiple columns:
SELECT State, City, SUM(retail_price * quantity) AS gross_revenue FROM sales GROUP BY State, City;Output:
+-------+---------+---------------+ | STATE | CITY | GROSS REVENUE | +=======+=========+===============+ | CA | SF | 22 | +-------+---------+---------------+ | CA | SJ | 44 | +-------+---------+---------------+ | FL | Miami | 80 | +-------+---------+---------------+ | FL | Orlando | 160 | +-------+---------+---------------+ | PR | SJ | 320 | +-------+---------+---------------+
Demonstrate Precedence When a Column Name and an Alias Match¶
It is possible (but usually a bad idea) to create a query that contains an alias that matches a column name:
select x, some_expression as x from ...
If a GROUP BY clause contains a name that matches both a column name and an alias, then the GROUP BY clause uses the column name. This is demonstrated in the example below.
Create a table and insert rows:
Create table employees (salary float, state varchar, employment_state varchar); insert into employees (salary, state, employment_state) values (60000, 'California', 'Active'), (70000, 'California', 'On leave'), (80000, 'Oregon', 'Active');
The query below returns the sum of the salaries of the employees who are active and the sum of the salaries of the employees who are on leave.
select sum(salary), ANY_VALUE(employment_state) from employees group by employment_state; +-------------+-----------------------------+ | SUM(SALARY) | ANY_VALUE(EMPLOYMENT_STATE) | |-------------+-----------------------------| | 140000 | Active | | 70000 | On leave | +-------------+-----------------------------+
The query below uses the alias state
, which matches the name of a column of a table in the query. When state
is used in
the GROUP BY, Snowflake interprets it as a reference to the column name, not the alias. This query therefore returns the sum of
the salaries of the employees in the state of California and the sum of the salaries of the employees in the state of Oregon,
yet displays employment_state information (e.g. “Active”) rather than the names of states or provinces.
select sum(salary), ANY_VALUE(employment_state) as state from employees group by state; +-------------+--------+ | SUM(SALARY) | STATE | |-------------+--------| | 130000 | Active | | 80000 | Active | +-------------+--------+