modin.pandas.DataFrameGroupBy.mean

DataFrameGroupBy.mean(numeric_only: bool = False, engine: Optional[Literal['cython', 'numba']] = None, engine_kwargs: Optional[dict[str, bool]] = None)

Compute mean of groups, excluding missing values.

Parameters:
  • numeric_only (bool, default False) – Include only float, int, boolean columns.

  • engine (str, default None) –

    • 'cython' : Runs the operation through C-extensions from cython.

    • 'numba' : Runs the operation through JIT compiled code from numba.

    • None : Defaults to 'cython', or to 'numba' if the global setting compute.use_numba is enabled.

    This parameter is ignored in Snowpark pandas, as the execution is always performed in Snowflake.

  • engine_kwargs (dict, default None) –

    • For 'cython' engine, there are no accepted engine_kwargs

    • For 'numba' engine, the engine can accept nopython, nogil and parallel dictionary keys. The values must either be True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False}

    This parameter is ignored in Snowpark pandas, as the execution is always performed in Snowflake.

Return type:

Series or DataFrame

Examples

>>> df = pd.DataFrame({'A': [1, 1, 2, 1, 2],
...                    'B': [np.nan, 2, 3, 4, 5],
...                    'C': [1, 2, 1, 1, 2]}, columns=['A', 'B', 'C'])

Group by one column and return the mean of the remaining columns in each group.

>>> df.groupby('A').mean()      
     B         C
A
1  3.0  1.333333
2  4.0  1.500000

Group by two columns and return the mean of the remaining column.

>>> df.groupby(['A', 'B']).mean()   
         C
A B
1 2.0  2.0
  4.0  1.0
2 3.0  1.0
  5.0  2.0

Group by one column and return the mean of only one particular column in the group.

>>> df.groupby('A')['B'].mean()
A
1    3.0
2    4.0
Name: B, dtype: float64
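The "excluding missing values" behaviour can be illustrated without a Snowflake session. The following is a minimal plain-Python sketch of the semantics only, not how Snowpark pandas executes the operation; the helper name group_means is hypothetical and is not part of any library. It reuses the data from the examples above and skips NaN entries when accumulating each group's sum and count.

```python
import math
from collections import defaultdict

def group_means(rows, key, value):
    """Hypothetical helper: mean of `value` per `key`, skipping missing values."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        v = row[value]
        if v is None or (isinstance(v, float) and math.isnan(v)):
            continue  # missing values contribute to neither the sum nor the count
        totals[row[key]] += v
        counts[row[key]] += 1
    # Simplification: a group with no valid values is omitted from the result.
    return {k: totals[k] / counts[k] for k in sorted(counts)}

# Same data as the DataFrame in the examples above.
rows = [
    {"A": 1, "B": math.nan, "C": 1},
    {"A": 1, "B": 2, "C": 2},
    {"A": 2, "B": 3, "C": 1},
    {"A": 1, "B": 4, "C": 1},
    {"A": 2, "B": 5, "C": 2},
]

means_b = group_means(rows, "A", "B")
means_c = group_means(rows, "A", "C")
print(means_b)  # {1: 3.0, 2: 4.0}
print(means_c)
```

For group A = 1, column B holds NaN, 2 and 4, so the mean is taken over the two valid values and equals 3.0, which matches the output of df.groupby('A')['B'].mean() shown above.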