Documentation for version 1.22.1 (an older release).

GroupBy

All supported groupby APIs

Indexing, iteration

DataFrameGroupBy.__iter__()

GroupBy iterator.

SeriesGroupBy.__iter__()

GroupBy iterator.

DataFrameGroupBy.get_group(name[, obj])

Construct a DataFrame from the group with the provided name.

DataFrameGroupBy.groups

Get a dictionary mapping group key to row labels.

SeriesGroupBy.groups

Get a dictionary mapping group key to row labels.

DataFrameGroupBy.indices

Get a dictionary mapping group key to row positions.

SeriesGroupBy.indices

Get a dictionary mapping group key to row positions.
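
The difference between groups (row labels) and indices (row positions) is easiest to see on a frame whose index is not 0..n-1. The following is a minimal sketch that uses stock pandas as a stand-in, on the assumption that the methods listed here behave like their pandas namesakes. The sample frame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; a string index makes labels and positions differ.
df = pd.DataFrame(
    {"team": ["a", "b", "a", "b"], "score": [1, 2, 3, 4]},
    index=["w", "x", "y", "z"],
)
grouped = df.groupby("team")

# Iterating yields (group key, sub-frame) pairs.
keys = [key for key, _ in grouped]

# groups maps each key to row labels; indices maps each key to integer positions.
labels = {key: list(idx) for key, idx in grouped.groups.items()}
positions = {key: idx.tolist() for key, idx in grouped.indices.items()}

# get_group returns only the rows that belong to one key.
team_a = grouped.get_group("a")

print(keys)
print(labels)
print(positions)
print(team_a)
```

Here labels holds index values such as "w" and "y", while positions holds the corresponding integer offsets.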

Function application

DataFrameGroupBy.apply(func, *args, **kwargs)

Apply function func group-wise and combine the results together.

DataFrameGroupBy.agg([func, engine, ...])

Aggregate using one or more operations over the specified axis.

SeriesGroupBy.agg([func, engine, engine_kwargs])

Aggregate using one or more operations over the specified axis.

DataFrameGroupBy.aggregate([func, engine, ...])

Aggregate using one or more operations over the specified axis.

SeriesGroupBy.aggregate([func, engine, ...])

Aggregate using one or more operations over the specified axis.

DataFrameGroupBy.transform(func, *args[, ...])

Call function producing a same-indexed DataFrame on each group.
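
The three entry points differ mainly in the shape of what they return: agg reduces each group to one row, transform returns a result aligned with the original rows, and apply combines whatever the function returns per group. A minimal sketch follows, again using stock pandas as a stand-in and assuming equivalent behavior; the data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"team": ["a", "b", "a", "b"], "score": [1, 2, 3, 4]})
grouped = df.groupby("team")

# agg: one row per group.
totals = grouped.agg("sum")

# transform: same index as df, each row receives its group's value.
group_totals = grouped.transform("sum")

# apply: run an arbitrary function on each group's sub-frame and combine.
# Selecting the column first keeps the grouping column out of the sub-frame.
spread = grouped[["score"]].apply(lambda g: g["score"].max() - g["score"].min())

print(totals)
print(group_totals)
print(spread)
```

A common use of transform is to compare each row with a statistic of its own group, for example df["score"] - grouped["score"].transform("mean").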

DataFrameGroupBy computations / descriptive stats

DataFrameGroupBy.all([skipna])

Return True if all values in the group are truthy, else False.

DataFrameGroupBy.any([skipna])

Return True if any value in the group is truthy, else False.

DataFrameGroupBy.count()

Compute count of group, excluding missing values.

DataFrameGroupBy.cumcount([ascending])

Number each item in each group from 0 to the length of that group - 1.

DataFrameGroupBy.cummax([axis, numeric_only])

Cumulative max for each group.

DataFrameGroupBy.cummin([axis, numeric_only])

Cumulative min for each group.

DataFrameGroupBy.cumsum([axis])

Cumulative sum for each group.

DataFrameGroupBy.first([numeric_only, ...])

Compute the first non-null entry of each column within each group.

DataFrameGroupBy.head([n])

Return first n rows of each group.

DataFrameGroupBy.idxmax([axis, skipna, ...])

Return the index of the first occurrence of maximum over requested axis.

DataFrameGroupBy.idxmin([axis, skipna, ...])

Return the index of the first occurrence of minimum over requested axis.

DataFrameGroupBy.last([numeric_only, ...])

Compute the last non-null entry of each column within each group.

DataFrameGroupBy.max([numeric_only, ...])

Compute max of group values.

DataFrameGroupBy.mean([numeric_only, ...])

Compute mean of groups, excluding missing values.

DataFrameGroupBy.median([numeric_only])

Compute median of groups, excluding missing values.

DataFrameGroupBy.min([numeric_only, ...])

Compute min of group values.

DataFrameGroupBy.nunique([dropna])

Return DataFrame with counts of unique elements in each position.

DataFrameGroupBy.quantile([q, interpolation])

Return group values at the given quantile, like numpy.percentile.

DataFrameGroupBy.rank([method, ascending, ...])

Provide the rank of values within each group.

DataFrameGroupBy.shift([periods, freq, ...])

Shift each group by periods observations.

DataFrameGroupBy.size()

Compute group sizes.

DataFrameGroupBy.std([ddof, engine, ...])

Compute standard deviation of groups, excluding missing values.

DataFrameGroupBy.sum([numeric_only, ...])

Compute sum of group values.

DataFrameGroupBy.tail([n])

Return last n rows of each group.

DataFrameGroupBy.value_counts([subset, ...])

Return a Series or DataFrame containing counts of unique rows.

DataFrameGroupBy.var([ddof, engine, ...])

Compute variance of groups, excluding missing values.
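
Several of the summaries above hinge on how missing values are treated: size counts every row, while count and mean skip missing values. A minimal sketch under the assumption that these methods follow pandas semantics (stock pandas is used here as a stand-in, and the data is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing score in group "a".
df = pd.DataFrame(
    {"team": ["a", "a", "b", "b", "b"], "score": [1.0, np.nan, 2.0, 4.0, 6.0]}
)
grouped = df.groupby("team")

sizes = grouped.size()             # rows per group, missing values included
counts = grouped.count()["score"]  # non-missing values per group
means = grouped.mean()["score"]    # mean over non-missing values only

# cumcount numbers the rows inside each group from 0 to len(group) - 1.
order = grouped.cumcount()

# head(n) keeps the first n rows of every group, preserving the original index.
first_rows = grouped.head(1)

print(sizes, counts, means, order, first_rows, sep="\n\n")
```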

SeriesGroupBy computations / descriptive stats

SeriesGroupBy.all([skipna])

Return True if all values in the group are truthy, else False.

SeriesGroupBy.any([skipna])

Return True if any value in the group is truthy, else False.

SeriesGroupBy.count()

Compute count of group, excluding missing values.

SeriesGroupBy.cumcount([ascending])

Number each item in each group from 0 to the length of that group - 1.

SeriesGroupBy.cummax([axis, numeric_only])

Cumulative max for each group.

SeriesGroupBy.cummin([axis, numeric_only])

Cumulative min for each group.

SeriesGroupBy.cumsum([axis])

Cumulative sum for each group.

SeriesGroupBy.first([numeric_only, ...])

Compute the first non-null entry within each group.

SeriesGroupBy.head([n])

Return first n rows of each group.

SeriesGroupBy.idxmax([axis, skipna, ...])

Return the index of the first occurrence of maximum over requested axis.

SeriesGroupBy.idxmin([axis, skipna, ...])

Return the index of the first occurrence of minimum over requested axis.

SeriesGroupBy.last([numeric_only, ...])

Compute the last non-null entry within each group.

SeriesGroupBy.max([numeric_only, min_count, ...])

Compute max of group values.

SeriesGroupBy.mean([numeric_only, engine, ...])

Compute mean of groups, excluding missing values.

SeriesGroupBy.median([numeric_only])

Compute median of groups, excluding missing values.

SeriesGroupBy.min([numeric_only, min_count, ...])

Compute min of group values.

SeriesGroupBy.nunique([dropna])

Return the number of unique elements in each group.

SeriesGroupBy.quantile([q, interpolation])

Return group values at the given quantile, like numpy.percentile.

SeriesGroupBy.rank([method, ascending, ...])

Provide the rank of values within each group.

SeriesGroupBy.shift([periods, freq, axis, ...])

Shift each group by periods observations.

SeriesGroupBy.size()

Compute group sizes.

SeriesGroupBy.std([ddof, engine, ...])

Compute standard deviation of groups, excluding missing values.

SeriesGroupBy.sum([numeric_only, min_count, ...])

Compute sum of group values.

SeriesGroupBy.tail([n])

Return last n rows of each group.

SeriesGroupBy.value_counts([subset, ...])

Return a Series containing counts of unique values in each group.

SeriesGroupBy.var([ddof, engine, ...])

Compute variance of groups, excluding missing values.
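
On a grouped Series the same methods return one value per group (reductions such as nunique, idxmax, median) or one value per original row (rank). A minimal sketch, using stock pandas as a stand-in and assuming matching behavior; the series and the grouping list are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: a labelled Series grouped by a parallel list of keys.
scores = pd.Series([3, 1, 3, 5, 2], index=["v", "w", "x", "y", "z"])
teams = ["a", "a", "a", "b", "b"]
grouped = scores.groupby(teams)

distinct = grouped.nunique()        # number of distinct values per group
top_label = grouped.idxmax()        # index label of the first maximum per group
medians = grouped.median()          # one median per group
ranks = grouped.rank(method="min")  # rank within the group; ties share the lowest rank

print(distinct, top_label, medians, ranks, sep="\n\n")
```

Note that idxmax returns the label of the first occurrence, so the tie between "v" and "x" in group "a" resolves to "v".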