snowflake.snowpark.functions.width_bucket

snowflake.snowpark.functions.width_bucket(expr: Union[snowflake.snowpark.column.Column, str], min_value: Union[snowflake.snowpark.column.Column, str], max_value: Union[snowflake.snowpark.column.Column, str], num_buckets: Union[snowflake.snowpark.column.Column, str]) → Column

Constructs an equi-width histogram, in which the histogram range is divided into intervals of identical size, and returns the number of the bucket to which the value of the input expression is assigned.

Parameters:
  • expr (ColumnOrName) – The expression to evaluate and assign to a bucket.

  • min_value (ColumnOrName) – The minimum value of the histogram range.

  • max_value (ColumnOrName) – The maximum value of the histogram range.

  • num_buckets (ColumnOrName) – The number of buckets to create.

Returns:

The bucket number (1-based) that the input expression falls into. Values below min_value return 0, and values at or above max_value return num_buckets + 1.

Return type:

Column

Examples:
>>> df = session.create_dataframe([
...     [290000.00],
...     [320000.00],
...     [399999.99],
...     [400000.00],
...     [470000.00],
...     [510000.00]
... ], schema=["price"])
>>> df.select(width_bucket(df["price"], lit(200000), lit(600000), lit(4)).alias("sales_group")).collect()
[Row(SALES_GROUP=1), Row(SALES_GROUP=2), Row(SALES_GROUP=2), Row(SALES_GROUP=3), Row(SALES_GROUP=3), Row(SALES_GROUP=4)]
>>> df = session.create_dataframe([[150000.00]], schema=["price"])
>>> df.select(width_bucket(df["price"], lit(200000), lit(600000), lit(4)).alias("sales_group")).collect()
[Row(SALES_GROUP=0)]
>>> df = session.create_dataframe([[700000.00]], schema=["price"])
>>> df.select(width_bucket(df["price"], lit(200000), lit(9600000), lit(4)).alias("sales_group")).collect()
[Row(SALES_GROUP=1)]
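
The bucket numbers in the examples above can be checked by hand with a small pure-Python sketch of the equi-width arithmetic. The helper name `width_bucket_sketch` is hypothetical and is not part of Snowpark. It assumes numeric inputs with `min_value < max_value` and a positive `num_buckets`, and it ignores NULL handling. It is meant only to illustrate the calculation, not to describe how Snowflake evaluates the function.

```python
def width_bucket_sketch(value, min_value, max_value, num_buckets):
    """Hypothetical helper: approximate the equi-width bucket arithmetic.

    Assumes min_value < max_value and num_buckets > 0 (not validated here).
    """
    # Values below the range go to the underflow bucket 0.
    if value < min_value:
        return 0
    # Values at or above the upper bound go to the overflow bucket.
    if value >= max_value:
        return num_buckets + 1
    # Each bucket covers an interval of identical width.
    width = (max_value - min_value) / num_buckets
    bucket = int((value - min_value) // width) + 1
    # Guard against floating-point rounding pushing a value past the last bucket.
    return min(bucket, num_buckets)


prices = [290000.00, 320000.00, 399999.99, 400000.00, 470000.00, 510000.00]
print([width_bucket_sketch(p, 200000, 600000, 4) for p in prices])
```

With a range of 200000 to 600000 and 4 buckets, each bucket is 100000 wide, so 399999.99 still falls in bucket 2 while 400000.00 starts bucket 3.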