modin.pandas.SeriesGroupBy.std
- SeriesGroupBy.std(ddof=1, engine=None, engine_kwargs=None, numeric_only=False)
- Compute standard deviation of groups, excluding missing values.
- For multiple groupings, the result index will be a MultiIndex.
- Parameters:
- ddof (int, default 1) – Delta degrees of freedom. The divisor used in the calculation is N - ddof, where N is the number of elements in the group. Snowpark pandas currently only supports ddof=0 and ddof=1.
- engine (str, default None) – In pandas, engine can be configured as 'cython' or 'numba', and None defaults to 'cython' or to the global setting compute.use_numba. This parameter is ignored in Snowpark pandas, as the execution is always performed in Snowflake.
- engine_kwargs (dict, default None) – Configuration keywords for the configured execution engine. This parameter is ignored in Snowpark pandas, as the execution is always performed in Snowflake.
- numeric_only (bool, default False) – Include only float, int or boolean data columns. 
 
- Returns:
- Standard deviation of values within each group. 
- Return type:
- Examples

For SeriesGroupBy:

>>> lst = ['a', 'a', 'a', 'b', 'b', 'b', 'c']
>>> ser = pd.Series([7, 2, 8, 4, 3, 3, 1], index=lst)
>>> ser
a    7
a    2
a    8
b    4
b    3
b    3
c    1
dtype: int64
>>> ser.groupby(level=0).std()
a    3.21455
b    0.57735
c        NaN
dtype: float64
>>> ser.groupby(level=0).std(ddof=0)
a    2.624669
b    0.471404
c    0.000000
dtype: float64

Note that if the number of elements in a group is less than or equal to ddof, the result for that group will be NaN/None. For example, the value for group c is NaN when we call ser.groupby(level=0).std(), because the group has a single element and the default ddof is 1.

For DataFrameGroupBy:

>>> data = {'a': [1, 3, 5, 7, 7, 8, 3], 'b': [1, 4, 8, 4, 4, 2, 1]}
>>> df = pd.DataFrame(data, index=pd.Index(['dog', 'dog', 'dog',
...                   'mouse', 'mouse', 'mouse', 'mouse'], name='c'))
>>> df
       a  b
c
dog    1  1
dog    3  4
dog    5  8
mouse  7  4
mouse  7  4
mouse  8  2
mouse  3  1
>>> df.groupby('c').std()
              a         b
c
dog    2.000000  3.511885
mouse  2.217356  1.500000

>>> data = {'a': [1, 3, 5, 7, 7, 8, 3], 'b': ['c', 'e', 'd', 'a', 'a', 'b', 'e']}
>>> df = pd.DataFrame(data, index=pd.Index(['dog', 'dog', 'dog',
...                   'mouse', 'mouse', 'mouse', 'mouse'], name='c'))
>>> df
       a  b
c
dog    1  c
dog    3  e
dog    5  d
mouse  7  a
mouse  7  a
mouse  8  b
mouse  3  e
>>> df.groupby('c').std(numeric_only=True)
              a
c
dog    2.000000
mouse  2.217356
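
To make the role of ddof concrete, the following is a minimal pure-Python sketch of the arithmetic behind the SeriesGroupBy example above. It is not how Snowpark pandas executes the operation (execution happens in Snowflake); the helper name `group_std` is hypothetical, and the sketch assumes the input contains no missing values.

```python
import math
from collections import defaultdict


def group_std(values, keys, ddof=1):
    """Hypothetical helper: per-group standard deviation with divisor n - ddof."""
    # Collect the values belonging to each group key.
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)

    result = {}
    for key, items in groups.items():
        n = len(items)
        # With n <= ddof the divisor would be zero or negative, so report NaN.
        if n <= ddof:
            result[key] = math.nan
            continue
        mean = sum(items) / n
        squared_dev = sum((x - mean) ** 2 for x in items)
        result[key] = math.sqrt(squared_dev / (n - ddof))
    return result


keys = ['a', 'a', 'a', 'b', 'b', 'b', 'c']
values = [7, 2, 8, 4, 3, 3, 1]

sample = group_std(values, keys)              # ddof=1, divisor n - 1
population = group_std(values, keys, ddof=0)  # ddof=0, divisor n
```

Here `sample` matches the default `std()` output shown above (group c is NaN because it has one element), while `population` matches the `std(ddof=0)` output, where group c becomes 0.0.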