You are viewing documentation about an older version (1.3.0).

snowflake.snowpark.DataFrame.agg

DataFrame.agg(*exprs: Column | Tuple[ColumnOrName, str] | Dict[str, str]) → DataFrame

Aggregate the data in the DataFrame. Use this method when you do not need to group the data first; to aggregate per group, call group_by() instead.

Parameters:

exprs –

A variable-length argument list in which every element is one of:

  • A Column object

  • A tuple whose first element is a Column object or a column name and whose second element is the name of the aggregate function

  • A list of the above

or a dict that maps column names to aggregate function names.
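
The name-based argument forms listed above all reduce to the same idea: a sequence of (column name, aggregate function name) pairs. The following is a minimal sketch of that equivalence in plain Python. The helper normalize_agg_exprs is hypothetical and is not part of Snowpark; Column objects are left out so the sketch has no dependencies, and the way Snowpark itself processes these arguments may differ.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical types and helper for illustration only; not Snowpark's API.
Pair = Tuple[str, str]
Expr = Union[Pair, List[Pair], Dict[str, str]]


def normalize_agg_exprs(*exprs: Expr) -> List[Pair]:
    """Flatten tuples, lists of tuples, and dicts into (column, function) pairs."""
    pairs: List[Pair] = []
    for expr in exprs:
        if isinstance(expr, dict):
            # {"a": "min"} contributes one ("a", "min") pair per key.
            pairs.extend(expr.items())
        elif isinstance(expr, list):
            # A list holds further expressions; flatten it recursively.
            pairs.extend(normalize_agg_exprs(*expr))
        elif isinstance(expr, tuple) and len(expr) == 2:
            pairs.append((expr[0], expr[1]))
        else:
            raise TypeError(f"unsupported aggregate expression: {expr!r}")
    return pairs


# Three spellings of the same request: MIN of column a and MAX of column b.
tuple_form = normalize_agg_exprs(("a", "min"), ("b", "max"))
list_form = normalize_agg_exprs([("a", "min"), ("b", "max")])
dict_form = normalize_agg_exprs({"a": "min", "b": "max"})
```

Under these assumptions the three calls produce the same list of pairs. One practical difference is that a dict can hold only one aggregate function per column name, whereas the tuple form can name the same column more than once.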

Examples:

>>> from snowflake.snowpark.functions import col, stddev, stddev_pop

>>> df = session.create_dataframe([[1, 2], [3, 4], [1, 4]], schema=["A", "B"])
>>> df.agg(stddev(col("a"))).show()
----------------------
|"STDDEV(A)"         |
----------------------
|1.1547003940416753  |
----------------------


>>> df.agg(stddev(col("a")), stddev_pop(col("a"))).show()
-------------------------------------------
|"STDDEV(A)"         |"STDDEV_POP(A)"     |
-------------------------------------------
|1.1547003940416753  |0.9428091005076267  |
-------------------------------------------


>>> df.agg(("a", "min"), ("b", "max")).show()
-----------------------
|"MIN(A)"  |"MAX(B)"  |
-----------------------
|1         |4         |
-----------------------


>>> df.agg({"a": "count", "b": "sum"}).show()
-------------------------
|"COUNT(A)"  |"SUM(B)"  |
-------------------------
|3           |10        |
-------------------------
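
The values in the examples above can be cross-checked without a Snowflake session. The following sketch recomputes them with the Python standard library, on the assumption that STDDEV corresponds to the sample standard deviation and STDDEV_POP to the population standard deviation, which is what the example output suggests. It is an independent check, not a description of how Snowflake computes these aggregates.

```python
import statistics

# Same data as the examples: rows [1, 2], [3, 4], [1, 4] with columns A and B.
rows = [[1, 2], [3, 4], [1, 4]]
a = [row[0] for row in rows]
b = [row[1] for row in rows]

# Sample standard deviation divides by n - 1; population divides by n.
sample_sd = statistics.stdev(a)
population_sd = statistics.pstdev(a)

# MIN(A) and MAX(B) from the tuple-form example.
min_a, max_b = min(a), max(b)

# COUNT(A) and SUM(B) from the dict-form example. len() matches COUNT here
# only because column A contains no NULL values.
count_a, sum_b = len(a), sum(b)

print(sample_sd, population_sd)
print(min_a, max_b, count_a, sum_b)
```

The two standard deviations agree with the documented output to about six significant digits, and the remaining four values match exactly.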

Note

Each aggregate function name must be the name of a valid Snowflake aggregate function.