You are viewing documentation about an older version (1.12.1).

snowflake.snowpark.DataFrame.select

DataFrame.select(*cols: Union[Column, str, TableFunctionCall, Iterable[Union[Column, str, TableFunctionCall]]]) → DataFrame

Returns a new DataFrame with the specified Column expressions as output (similar to SELECT in SQL). Only the Columns specified as arguments will be present in the resulting DataFrame.

You can pass any Column expression, or a string that names an existing column.

Example 1:

>>> from snowflake.snowpark.functions import col
>>> df = session.create_dataframe([[1, "some string value", 3, 4]], schema=["col1", "col2", "col3", "col4"])
>>> df_selected = df.select(col("col1"), col("col2").substr(0, 10), df["col3"] + df["col4"])

Example 2:

>>> df_selected = df.select("col1", "col2", "col3")

Example 3:

>>> df_selected = df.select(["col1", "col2", "col3"])

Example 4:

>>> df_selected = df.select(df["col1"], df.col2, df.col("col3"))

Example 5:

>>> from snowflake.snowpark.functions import lit, table_function
>>> split_to_table = table_function("split_to_table")
>>> df.select(df.col1, split_to_table(df.col2, lit(" ")), df.col("col3")).show()
-----------------------------------------------
|"COL1"  |"SEQ"  |"INDEX"  |"VALUE"  |"COL3"  |
-----------------------------------------------
|1       |1      |1        |some     |3       |
|1       |1      |2        |string   |3       |
|1       |1      |3        |value    |3       |
-----------------------------------------------

Note

A TableFunctionCall can be passed to select even when the DataFrame is itself the result of another join. This works because the order in which the joins are applied is known.

Parameters:

*cols – A Column, str, table_function.TableFunctionCall, or a list of those. Note that at most one table_function.TableFunctionCall object is supported within a select call.
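
The argument rules above (positional columns or a single list, and at most one table function call) can be illustrated without a Snowflake session. The following is a minimal stand-alone sketch, not Snowpark's implementation: the `TableFunctionCall` class here is a hypothetical stand-in, `normalize_select_args` is a made-up helper name, and the error type and message are assumptions.

```python
from typing import Any, List


class TableFunctionCall:
    """Hypothetical stand-in for Snowpark's TableFunctionCall (illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name


def normalize_select_args(*cols: Any) -> List[Any]:
    """Flatten select-style arguments and check the table function limit."""
    # A single list or tuple argument is unpacked, mirroring select(["a", "b"]).
    if len(cols) == 1 and isinstance(cols[0], (list, tuple)):
        flat = list(cols[0])
    else:
        flat = list(cols)

    # Enforce the documented limit of one table function call per select.
    # The exception type is an assumption made for this sketch.
    table_calls = [c for c in flat if isinstance(c, TableFunctionCall)]
    if len(table_calls) > 1:
        raise ValueError("at most one TableFunctionCall is supported within a select call")
    return flat


# Both calling styles from Examples 2 and 3 yield the same column list.
print(normalize_select_args("col1", "col2", "col3"))
print(normalize_select_args(["col1", "col2", "col3"]))
```

In this sketch, passing two `TableFunctionCall` objects raises an error, which reflects the restriction stated in the parameter description; how Snowpark itself reports that case may differ.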