You are viewing documentation for an older version (1.21.0).

modin.pandas.DataFrameGroupBy.mean

DataFrameGroupBy.mean(numeric_only: bool = False, engine: Optional[Literal['cython', 'numba']] = None, engine_kwargs: Optional[dict[str, bool]] = None)

Compute mean of groups, excluding missing values.

Parameters:

numeric_only (bool, default False) – Include only float, int, boolean columns.
engine (str, default None) –
- 'cython' : Runs the operation through C-extensions from cython.
- 'numba' : Runs the operation through JIT compiled code from numba.
- None : Defaults to 'cython', or to 'numba' if the global setting compute.use_numba is enabled.
This parameter is ignored in Snowpark pandas, as the execution is always performed in Snowflake.
engine_kwargs (dict, default None) –
- For 'cython' engine, there are no accepted engine_kwargs
- For 'numba' engine, the engine can accept nopython, nogil
  and parallel dictionary keys. The values must be either True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False}.
This parameter is ignored in Snowpark pandas, as the execution is always performed in Snowflake.
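
As a minimal sketch of what numeric_only changes, the snippet below uses plain pandas as a stand-in (an assumption made so the example runs without a Snowflake session; with Snowpark pandas the import would be modin.pandas instead). The 'label' column and its values are hypothetical and exist only to give the frame a non-numeric column. engine and engine_kwargs are left out because, as noted above, they are ignored in Snowpark pandas.

```python
import numpy as np
import pandas as pd  # stand-in; assumption: Snowpark pandas mirrors this behavior

df = pd.DataFrame({
    "A": [1, 1, 2, 1, 2],
    "B": [np.nan, 2, 3, 4, 5],
    "label": ["x", "y", "x", "y", "x"],  # hypothetical non-numeric column
})

# numeric_only=True restricts the aggregation to numeric columns,
# so the string column 'label' does not appear in the result.
numeric_means = df.groupby("A").mean(numeric_only=True)
print(numeric_means)
```

With the default numeric_only=False, the non-numeric column is not filtered out beforehand, so passing numeric_only=True is the safer choice when the frame mixes numeric and string data.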

Return type:

Series or DataFrame

Examples

>>> import numpy as np
>>> import modin.pandas as pd
>>> df = pd.DataFrame({'A': [1, 1, 2, 1, 2],
...                    'B': [np.nan, 2, 3, 4, 5],
...                    'C': [1, 2, 1, 1, 2]}, columns=['A', 'B', 'C'])

Group by one column and return the mean of the remaining columns in each group.

>>> df.groupby('A').mean()      
     B         C
A
1  3.0  1.333333
2  4.0  1.500000

Group by two columns and return the mean of the remaining column.

>>> df.groupby(['A', 'B']).mean()   
         C
A B
1 2.0  2.0
  4.0  1.0
2 3.0  1.0
  5.0  2.0

Group by one column and return the mean of only one particular column in each group.

>>> df.groupby('A')['B'].mean()
A
1    3.0
2    4.0
Name: B, dtype: float64
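
To make "excluding missing values" concrete, the following sketch checks the group mean against the sum of the non-missing values divided by their count. It again uses plain pandas as a stand-in (an assumption; the same calls are expected to apply to a Snowpark pandas DataFrame), and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd  # stand-in; assumption: Snowpark pandas mirrors this behavior

df = pd.DataFrame({"A": [1, 1, 2, 1, 2],
                   "B": [np.nan, 2, 3, 4, 5]})

grouped = df.groupby("A")["B"]

# sum() skips NaN and count() counts only non-missing entries,
# so their ratio reproduces the per-group mean.
manual = grouped.sum() / grouped.count()
means = grouped.mean()

print(manual)
print(means)
```

For group 1 the NaN in column B is skipped, so the mean is taken over the two remaining values (2 and 4) rather than over three rows.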