Categories:

Aggregate Functions (Percentile Estimation) , Window Functions

# APPROX_PERCENTILE¶

Returns an approximated value for the desired percentile (i.e. if column c has n numbers, then APPROX_PERCENTILE(c, p) returns a number such that approximately n * p of the numbers in c are smaller than the returned number).

This function uses the t-Digest algorithm. For more information, see Estimating Percentile Values.

## Syntax¶

Aggregate function

APPROX_PERCENTILE( <expr> , <percentile> )


Window function

APPROX_PERCENTILE( <expr> , <percentile> ) OVER ( [ PARTITION BY <expr3> ] )


## Arguments¶

expr

A valid expression, such as a column name, that evaluates to a numeric value.

percentile

A constant real value greater than or equal to 0.0 and less than 1.0. This indicates the percentile (from 0 to 99.999…). E.g. The value 0.65 indicates the 65th percentile.

expr3

This is the optional expression used to group rows into partitions.

## Returns¶

The output is returned as a DOUBLE value.

## Usage Notes¶

• Percentile works only on numeric values, so expr should produce values that are numbers or can be cast to numbers.

• The values returned are not necessarily in the data set.

• The value returned is an approximation. The size of the data set and the skew in the data set affect the accuracy of the approximation.

• When used as a window function:

• This function does not support:

• ORDER BY sub-clause in the OVER() clause.

• Window frames.

## Examples¶

Demonstrate the APPROX_PERCENTILE function:

Create and populate a table with values:

CREATE TABLE testtable (c1 INTEGER);
INSERT INTO testtable (c1) VALUES
(0),
(1),
(2),
(3),
(4),
(5),
(6),
(7),
(8),
(9),
(10);


Run queries and show the output:

SELECT APPROX_PERCENTILE(c1, 0.1) FROM testtable;
+----------------------------+
| APPROX_PERCENTILE(C1, 0.1) |
|----------------------------|
|                        1.5 |
+----------------------------+

SELECT APPROX_PERCENTILE(c1, 0.5) FROM testtable;
+----------------------------+
| APPROX_PERCENTILE(C1, 0.5) |
|----------------------------|
|                        5.5 |
+----------------------------+


Note that the value returned in this case is higher than any value actually in the data set:

SELECT APPROX_PERCENTILE(c1, 0.999) FROM testtable;
+------------------------------+
| APPROX_PERCENTILE(C1, 0.999) |
|------------------------------|
|                         10.5 |
+------------------------------+