You are viewing documentation about an older version (1.15.0). View latest version

snowflake.snowpark.DataFrame.pivot¶

DataFrame.pivot(pivot_col: Union[Column, str], values: Optional[Union[Iterable[Union[None, bool, int, float, str, bytearray, Decimal, date, datetime, time, bytes, list, tuple, dict]], DataFrame]] = None, default_on_null: Union[None, bool, int, float, str, bytearray, Decimal, date, datetime, time, bytes, list, tuple, dict] = None) → RelationalGroupedDataFrame[source]¶

Rotates this DataFrame by turning the unique values from one column in the input expression into multiple columns and aggregating results where required on any remaining column values.

Only one aggregate is supported with pivot.

Example:

>>> create_result = session.sql('''create or replace temp table monthly_sales(empid int, amount int, month text)
... as select * from values
... (1, 10000, 'JAN'),
... (1, 400, 'JAN'),
... (2, 4500, 'JAN'),
... (2, 35000, 'JAN'),
... (1, 5000, 'FEB'),
... (1, 3000, 'FEB'),
... (2, 200, 'FEB') ''').collect()
>>> df = session.table("monthly_sales")
>>> df.pivot("month", ['JAN', 'FEB']).sum("amount").sort(df["empid"]).show()
-------------------------------
|"EMPID"  |"'JAN'"  |"'FEB'"  |
-------------------------------
|1        |10400    |8000     |
|2        |39500    |200      |
-------------------------------


>>> df = session.table("monthly_sales")
>>> df.pivot("month").sum("amount").sort("empid").show()
-------------------------------
|"EMPID"  |"'FEB'"  |"'JAN'"  |
-------------------------------
|1        |8000     |10400    |
|2        |200      |39500    |
-------------------------------


>>> subquery_df = session.table("monthly_sales").select(col("month")).filter(col("month") == "JAN")
>>> df = session.table("monthly_sales")
>>> df.pivot("month", values=subquery_df).sum("amount").sort("empid").show()
---------------------
|"EMPID"  |"'JAN'"  |
---------------------
|1        |10400    |
|2        |39500    |
---------------------
Copy
Parameters:
  • pivot_col – The column or name of the column to use.

  • values – A list of values in the column, or dynamic based on the DataFrame query, or None (default) will use all values of the pivot column.

  • default_on_null – Expression to replace empty result values.