You are viewing documentation about an older version (1.4.0). View latest version

snowflake.snowpark.Column.within_group¶

Column.within_group(*cols: ColumnOrName | Iterable[ColumnOrName]) → Column[source]¶

Returns a Column expression that adds a WITHIN GROUP clause to sort the rows by the specified columns.

This method is supported on Column expressions returned by some of the aggregate functions, including functions.array_agg(), functions.listagg(), PERCENTILE_CONT(), and PERCENTILE_DISC(). For details, see the Snowflake documentation for the aggregate function that you are using (e.g. ARRAY_AGG).

Examples:

>>> from snowflake.snowpark.functions import array_agg, col
>>> from snowflake.snowpark import Window

>>> df = session.create_dataframe([(3, "v1"), (1, "v3"), (2, "v2")], schema=["a", "b"])
>>> # create a DataFrame containing the values in "a" sorted by "b"
>>> df.select(array_agg("a").within_group("b").alias("new_column")).show()
----------------
|"NEW_COLUMN"  |
----------------
|[             |
|  3,          |
|  2,          |
|  1           |
|]             |
----------------

>>> # create a DataFrame containing the values in "a" grouped by "b"
>>> # and sorted by "a" in descending order.
>>> df_array_agg_window = df.select(array_agg("a").within_group(col("a").desc()).over(Window.partitionBy(col("b"))).alias("new_column"))
>>> df_array_agg_window.show()
----------------
|"NEW_COLUMN"  |
----------------
|[             |
|  3           |
|]             |
|[             |
|  1           |
|]             |
|[             |
|  2           |
|]             |
----------------
Copy