You are viewing documentation about an older version (1.4.0).

snowflake.snowpark.functions.array_distinct

snowflake.snowpark.functions.array_distinct(col: ColumnOrName)

Removes duplicate elements from the input ARRAY. The order of the elements in the returned ARRAY is not guaranteed. The function is NULL-safe: it treats NULLs as known values when identifying duplicates, so repeated NULLs count as duplicates of one another.

Parameters:

col – The column, or the name of the column, that contains the input ARRAY.

Returns:

A new ARRAY that contains only the distinct elements from the input ARRAY.

Example:

>>> from snowflake.snowpark.functions import array_construct, array_distinct, lit
>>> df = session.createDataFrame([["1"]], ["A"])
>>> df = df.withColumn("array", array_construct(lit(1), lit(1), lit(1), lit(2), lit(3), lit(2), lit(2)))
>>> df.withColumn("array_d", array_distinct("ARRAY")).show()
-----------------------------
|"A"  |"ARRAY"  |"ARRAY_D"  |
-----------------------------
|1    |[        |[          |
|     |  1,     |  1,       |
|     |  1,     |  2,       |
|     |  1,     |  3        |
|     |  2,     |]          |
|     |  3,     |           |
|     |  2,     |           |
|     |  2      |           |
|     |]        |           |
-----------------------------
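
The semantics described above (duplicates removed, NULLs treated as known values, no ordering guarantee) can be illustrated without a Snowflake session. The following is only a plain-Python sketch: `array_distinct_model` is a hypothetical helper written for this illustration, not part of Snowpark, and it assumes hashable scalar elements with `None` standing in for SQL NULL.

```python
def array_distinct_model(values):
    """Plain-Python model of NULL-safe de-duplication (hypothetical helper)."""
    seen = set()
    result = []
    for v in values:
        # None stands in for SQL NULL and is treated as an ordinary value,
        # so repeated None entries collapse to a single entry.
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result


deduped = array_distinct_model([1, 1, 1, 2, 3, 2, 2])
# Compare as sets: the real function does not guarantee element order.
print(set(deduped) == {1, 2, 3})  # prints True

with_nulls = array_distinct_model([None, "a", None, "a"])
print(len(with_nulls))  # prints 2
```

This model happens to keep first-seen order, but code that calls array_distinct should not rely on any particular order in the result.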