Categories:

Aggregate functions (Percentile Estimation) , Window functions

APPROX_PERCENTILE¶

Returns an approximated value for the desired percentile (i.e. if column c has n numbers, then APPROX_PERCENTILE(c, p) returns a number such that approximately n * p of the numbers in c are smaller than the returned number).

This function uses an improved version of the t-Digest algorithm. For more information, see Estimating Percentile Values.

See also:

APPROX_PERCENTILE_ACCUMULATE , APPROX_PERCENTILE_COMBINE , APPROX_PERCENTILE_ESTIMATE

Syntax¶

Aggregate function

APPROX_PERCENTILE( <expr> , <percentile> )
Copy

Window function

APPROX_PERCENTILE( <expr> , <percentile> ) OVER ( [ PARTITION BY <expr3> ] )
Copy

Arguments¶

expr

A valid expression, such as a column name, that evaluates to a numeric value.

percentile

A constant real value greater than or equal to 0.0 and less than 1.0. This indicates the percentile (from 0 to 99.999…). E.g. The value 0.65 indicates the 65th percentile.

expr3

This is the optional expression used to group rows into partitions.

Returns¶

The output is returned as a DOUBLE value.

Usage notes¶

  • Percentile works only on numeric values, so expr should produce values that are numbers or can be cast to numbers.

  • The values returned are not necessarily in the data set.

  • The value returned is an approximation. The size of the data set and the skew in the data set affect the accuracy of the approximation.

  • When used as a window function:

    • This function does not support:

      • ORDER BY sub-clause in the OVER() clause.

      • Window frames.

Examples¶

Demonstrate the APPROX_PERCENTILE function:

Create and populate a table with values:

CREATE TABLE testtable (c1 INTEGER);
INSERT INTO testtable (c1) VALUES 
    (0),
    (1),
    (2),
    (3),
    (4),
    (5),
    (6),
    (7),
    (8),
    (9),
    (10);
Copy

Run queries and show the output:

SELECT APPROX_PERCENTILE(c1, 0.1) FROM testtable;
+----------------------------+
| APPROX_PERCENTILE(C1, 0.1) |
|----------------------------|
|                        1.5 |
+----------------------------+
Copy
SELECT APPROX_PERCENTILE(c1, 0.5) FROM testtable;
+----------------------------+
| APPROX_PERCENTILE(C1, 0.5) |
|----------------------------|
|                        5.5 |
+----------------------------+
Copy

Note that the value returned in this case is higher than any value actually in the data set:

SELECT APPROX_PERCENTILE(c1, 0.999) FROM testtable;
+------------------------------+
| APPROX_PERCENTILE(C1, 0.999) |
|------------------------------|
|                         10.5 |
+------------------------------+
Copy