You are viewing documentation about an older version (1.18.0).

modin.pandas.DataFrameGroupBy.idxmax

DataFrameGroupBy.idxmax(axis: Union[int, Literal['index', 'columns', 'rows']] = _NoDefault.no_default, skipna: bool = True, numeric_only: bool = False)

Return the index of the first occurrence of maximum over requested axis.

NA/null values are excluded based on skipna.

Parameters:

axis ({0 or 'index', 1 or 'columns'}, default None) –
The axis to use. 0 or ‘index’ for row-wise, 1 or ‘columns’ for column-wise. If axis is not provided, grouper’s axis is used.

Snowpark pandas does not support axis=1, since it is deprecated in pandas.

Deprecated since version 2.1.0: For axis=1, operate on the underlying object instead. Otherwise, the axis keyword is not necessary.

skipna (bool, default True) – Exclude NA/null values. If an entire row/column is NA, the result will be NA.

numeric_only (bool, default False) – Include only float, int or boolean data.
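
The deprecation note for axis=1 suggests operating on the underlying object. One possible reading is to call DataFrame.idxmax(axis=1) on the ungrouped frame, as sketched below. The sketch uses plain pandas rather than Snowpark pandas (an assumption, made so that it runs without a Snowflake session), and the toy frame is hypothetical. Whether Snowpark pandas supports axis=1 on DataFrame.idxmax is not stated here and should be checked separately.

```python
import pandas as pd

# Hypothetical toy frame; column names follow the Examples section.
df = pd.DataFrame(
    {
        "species": ["lion", "tiger", "lion"],
        "speed": [78, -35, 15],
        "age": [50, 12, 20],
    },
    index=list("abc"),
)

# Instead of df.groupby("species").idxmax(axis=1), select the numeric
# columns of the underlying frame and ask, for each row, which column
# holds the maximum value.
max_column_per_row = df[["speed", "age"]].idxmax(axis=1)
print(max_column_per_row)
```

With axis=1 the result is indexed by the original row labels and contains column labels, so no grouping is involved.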

Returns:

Indexes of maxima along the specified axis.

Return type:

DataFrame (one row per group and one column per aggregated column, as in the examples below)

Raises:

ValueError – If the row/column is empty

See also

Series.idxmax: Return index of the maximum element.

Notes

This method is the DataFrame version of ndarray.argmax.
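
The relationship to ndarray.argmax can be illustrated with a short comparison: argmax returns an integer position, while idxmax returns the index label at that position. This is a minimal sketch in plain pandas and NumPy (an assumption, used so that it runs without a Snowflake session); the frame and its "score" column are hypothetical.

```python
import numpy as np
import pandas as pd

values = np.array([3.0, 9.0, 4.0])
frame = pd.DataFrame({"score": values}, index=["x", "y", "z"])

# argmax reports where the maximum sits as an integer position.
position = int(np.argmax(values))

# idxmax reports the index label of the maximum for each column.
labels = frame.idxmax()

# Translating the position through the index gives the same label.
assert frame.index[position] == labels["score"]
print(position, labels["score"])
```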

Examples

>>> small_df_data = [
...        ["lion", 78, 50, 50, 50],
...        ["tiger", -35, 12, -378, 1246],
...        ["giraffe", 54, -9, 67, -256],
...        ["hippopotamus", np.nan, -537, -47, -789],
...        ["tiger", 89, 2, 256, 246],
...        ["tiger", -325, 2, 2, 5],
...        ["tiger", 367, -367, 3, -6],
...        ["giraffe", 25, 6, 312, 6],
...        ["lion", -5, -5, -3, -4],
...        ["lion", 15, np.nan, 2, 12],
...        ["giraffe", 100, 200, 300, 400],
...        ["hippopotamus", -100, -300, -600, -200],
...        ["rhino", 26, 2, -45, 14],
...        ["rhino", -7, 63, 257, -257],
...        ["lion", 1, 2, 3, 4],
...        ["giraffe", -5, -6, -7, 8],
...        ["lion", 1234, 456, 78, np.nan],
... ]

>>> df = pd.DataFrame(
...     data=small_df_data,
...     columns=("species", "speed", "age", "weight", "height"),
...     index=list("abcdefghijklmnopq"),
... )

Group by axis=0, apply idxmax on axis=0

>>> df.groupby("species").idxmax(axis=0, skipna=True)  
             speed age weight height
species
giraffe          k   k      h      k
hippopotamus     l   l      d      l
lion             q   q      q      a
rhino            m   n      n      m
tiger            g   b      e      b

>>> df.groupby("species").idxmax(axis=0, skipna=False)  
             speed   age weight height
species
giraffe          k     k      h      k
hippopotamus  None     l      d      l
lion             q  None      q   None
rhino            m     n      n      m
tiger            g     b      e      b
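
The labels returned by idxmax can be passed to loc to retrieve the full row that holds each group's maximum. The following is a minimal sketch in plain pandas (an assumption, used so that it runs without a Snowflake session) on a smaller, hypothetical frame than the one above.

```python
import pandas as pd

# Hypothetical toy frame with the same column names as the example above.
df = pd.DataFrame(
    {
        "species": ["lion", "tiger", "lion", "tiger"],
        "speed": [78, -35, 15, 89],
        "age": [50, 12, 20, 2],
    },
    index=list("abcd"),
)

# One row per species; each cell holds the index label of that column's maximum.
labels = df.groupby("species").idxmax()

# Use the labels for one column to pull the complete rows of the fastest animals.
fastest = df.loc[labels["speed"]]
print(fastest)
```

Because each column can reach its maximum in a different row, select the label column of interest (here "speed") before indexing with loc.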