modin.pandas.DataFrame.quantile

DataFrame.quantile(q: Scalar | ListLike = 0.5, axis: Axis = 0, numeric_only: bool = False, interpolation: Literal['linear', 'lower', 'higher', 'midpoint', 'nearest'] = 'linear', method: Literal['single', 'table'] = 'single')

Return values at the given quantile over requested axis.

Parameters:

q (float or array-like of float, default 0.5) – The quantile(s) to compute. Each value must satisfy 0 <= q <= 1.
axis ({0 or 'index', 1 or 'columns'}, default 0) – Axis across which to compute quantiles.
numeric_only (bool, default False) – Include only data where is_numeric_dtype is true. When True, bool columns are included, but computing quantiles across bool values is ill-defined and results in an error in both pandas and Snowpark pandas.
interpolation ({"linear", "lower", "higher", "midpoint", "nearest"}, default "linear") –
Specifies the interpolation method to use if a quantile lies between two data points i and j:
- linear: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.
- lower: i.
- higher: j.
- nearest: i or j, whichever is nearest.
- midpoint: (i + j) / 2.
Snowpark pandas currently only supports “linear” and “nearest”.
method ({"single", "table"}, default "single") – Whether to compute quantiles per-column (“single”) or over all columns (“table”). When “table”, the only allowed interpolation methods are “nearest”, “lower”, and “higher”.
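
The interpolation formulas listed above can be sketched in plain Python. This is only an illustration of the arithmetic for a single column, not the Snowpark pandas implementation: the helper name quantile_sketch is hypothetical, the quantile position is assumed to be q * (n - 1) in the sorted data, and the tie-breaking rule for "nearest" is an assumption that real libraries may handle differently.

```python
import math


def quantile_sketch(values, q, interpolation="linear"):
    """Quantile of a non-empty list of numbers (hypothetical helper, illustration only)."""
    if not 0 <= q <= 1:
        raise ValueError("q must satisfy 0 <= q <= 1")
    data = sorted(values)
    # Assumed fractional position of the quantile within the sorted data.
    pos = q * (len(data) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    i, j = data[lo], data[hi]  # the two surrounding data points
    fraction = pos - lo        # fractional part of the position
    if interpolation == "linear":
        return i + (j - i) * fraction
    if interpolation == "lower":
        return i
    if interpolation == "higher":
        return j
    if interpolation == "midpoint":
        return (i + j) / 2
    if interpolation == "nearest":
        # Assumption: an exact tie (fraction == 0.5) resolves to the lower point.
        return i if fraction <= 0.5 else j
    raise ValueError(f"unknown interpolation: {interpolation!r}")


a = [1, 2, 3, 4]
b = [1, 10, 100, 100]
print(quantile_sketch(a, 0.1), quantile_sketch(b, 0.1))  # roughly 1.3 and 3.7
print(quantile_sketch(a, 0.5, "midpoint"))
```

With the default "linear" setting, the sketch agrees (up to floating-point rounding) with the values shown in the Examples section for columns a and b.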

Returns:

If q is an array, a DataFrame will be returned where the index is q, the columns are the columns of self, and the values are the quantiles. If q is a float, a Series will be returned where the index is the columns of self and the values are the quantiles.

Return type:

Series or DataFrame

Examples

>>> df = pd.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]), columns=['a', 'b'])

With a scalar q:

>>> df.quantile(.1) 
a    1.3
b    3.7
Name: 0.1, dtype: float64

With a list q:

>>> df.quantile([.1, .5]) 
       a     b
0.1  1.3   3.7
0.5  2.5  55.0

Values considered NaN do not affect the result:

>>> df = pd.DataFrame({"a": [None, 0, 25, 50, 75, 100, np.nan]})
>>> df.quantile([0, 0.25, 0.5, 0.75, 1]) 
          a
0.00    0.0
0.25   25.0
0.50   50.0
0.75   75.0
1.00  100.0
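
The NaN behaviour shown above is consistent with dropping missing values before the quantile is computed. A minimal pure-Python sketch of that idea follows; the helper names drop_missing and linear_quantile are hypothetical, and the sketch assumes linear interpolation at position q * (n - 1), not the library's actual code path.

```python
import math


def drop_missing(values):
    """Remove None and NaN entries (hypothetical helper)."""
    return [v for v in values if v is not None and not math.isnan(v)]


def linear_quantile(values, q):
    """Linear-interpolated quantile that ignores missing values (illustration only)."""
    data = sorted(drop_missing(values))
    if not data:
        # Assumption: with no valid data the result is NaN.
        return math.nan
    pos = q * (len(data) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


column = [None, 0, 25, 50, 75, 100, float("nan")]
results = {q: linear_quantile(column, q) for q in [0, 0.25, 0.5, 0.75, 1]}
print(results)
```

Only the five valid entries take part in the computation, so the quantiles come out as 0, 25, 50, 75 and 100, the same values as in the example.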

Notes

Only calls with axis=0 are currently supported.

Passing q as a Snowpark pandas DataFrame or Series is also unsupported.