You are viewing documentation about an older version (1.4.0). View latest version

snowflake.snowpark.DataFrame.flatten¶

DataFrame.flatten(input: ColumnOrName, path: str | None = None, outer: bool = False, recursive: bool = False, mode: str = 'BOTH') → DataFrame[source]¶

Flattens (explodes) compound values into multiple rows.

It creates a new DataFrame from this DataFrame, carries the existing columns to the new DataFrame, and adds the following columns to it:

SEQ

KEY

PATH

INDEX

VALUE

THIS

References: Snowflake SQL function FLATTEN.

If this DataFrame also has columns with the names above, you can disambiguate the columns by renaming them.

Example:

>>> table1 = session.sql("select parse_json(numbers) as numbers from values('[1,2]') as T(numbers)")
>>> flattened = table1.flatten(table1["numbers"])
>>> flattened.select(table1["numbers"], flattened["value"].as_("flattened_number")).show()
----------------------------------
|"NUMBERS"  |"FLATTENED_NUMBER"  |
----------------------------------
|[          |1                   |
|  1,       |                    |
|  2        |                    |
|]          |                    |
|[          |2                   |
|  1,       |                    |
|  2        |                    |
|]          |                    |
----------------------------------

Parameters:

input – The name of a column or a Column instance that will be unseated into rows. The column data must be of Snowflake data type VARIANT, OBJECT, or ARRAY.
path – The path to the element within a VARIANT data structure which needs to be flattened. The outermost element is to be flattened if path is empty or None.
outer – If False, any input rows that cannot be expanded, either because they cannot be accessed in the path or because they have zero fields or entries, are completely omitted from the output. Otherwise, exactly one row is generated for zero-row expansions (with NULL in the KEY, INDEX, and VALUE columns).
recursive – If False, only the element referenced by path is expanded. Otherwise, the expansion is performed for all sub-elements recursively.
mode – Specifies which types should be flattened “OBJECT”, “ARRAY”, or “BOTH”.

Returns:

A new DataFrame that has the columns carried from this DataFrame, the flattened new columns and new rows.