- Categories:
Aggregate functions (Percentile Estimation) , Window functions
APPROX_PERCENTILE¶
Returns an approximated value for the desired percentile (i.e. if column c
has n
numbers, then APPROX_PERCENTILE(c, p) returns a number such that approximately n * p
of the numbers in c
are smaller than the returned number).
This function uses an improved version of the t-Digest algorithm. For more information, see Estimating Percentile Values.
Syntax¶
Aggregate function
APPROX_PERCENTILE( <expr> , <percentile> )
Window function
APPROX_PERCENTILE( <expr> , <percentile> ) OVER ( [ PARTITION BY <expr3> ] )
Arguments¶
expr
A valid expression, such as a column name, that evaluates to a numeric value.
percentile
A constant real value greater than or equal to
0.0
and less than1.0
. This indicates the percentile (from 0 to 99.999…). E.g. The value 0.65 indicates the 65th percentile.expr3
This is the optional expression used to group rows into partitions.
Returns¶
The output is returned as a DOUBLE value.
Usage notes¶
Percentile works only on numeric values, so
expr
should produce values that are numbers or can be cast to numbers.The values returned are not necessarily in the data set.
The value returned is an approximation. The size of the data set and the skew in the data set affect the accuracy of the approximation.
When used as a window function:
This function does not support:
ORDER BY sub-clause in the OVER() clause.
Window frames.
Examples¶
Demonstrate the APPROX_PERCENTILE
function:
Create and populate a table with values:
CREATE TABLE testtable (c1 INTEGER); INSERT INTO testtable (c1) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);Run queries and show the output:
SELECT APPROX_PERCENTILE(c1, 0.1) FROM testtable; +----------------------------+ | APPROX_PERCENTILE(C1, 0.1) | |----------------------------| | 1.5 | +----------------------------+SELECT APPROX_PERCENTILE(c1, 0.5) FROM testtable; +----------------------------+ | APPROX_PERCENTILE(C1, 0.5) | |----------------------------| | 5.5 | +----------------------------+Note that the value returned in this case is higher than any value actually in the data set:
SELECT APPROX_PERCENTILE(c1, 0.999) FROM testtable; +------------------------------+ | APPROX_PERCENTILE(C1, 0.999) | |------------------------------| | 10.5 | +------------------------------+