You are viewing documentation about an older version (1.3.0).

snowflake.snowpark.DataFrame.withColumn

DataFrame.withColumn(col_name: str, col: Column | TableFunctionCall) → DataFrame

Returns a DataFrame with an additional column named col_name, computed from the expression col. withColumn is an alias of with_column, which is the name used in the examples on this page.

If a column with the same name already exists in the DataFrame, that column is replaced by the new column.
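
To make the add-or-replace rule concrete, here is a minimal plain-Python sketch that models it on a list of dict rows. It is not Snowpark code and needs no session: with_column_model, Row, and the lambda expressions are hypothetical names used only for illustration, and the key order of the resulting dicts is not meant to mirror Snowflake's column order.

```python
from typing import Callable, Dict, List

# Hypothetical row type for this sketch: column name -> numeric value.
Row = Dict[str, float]


def with_column_model(
    rows: List[Row], col_name: str, expr: Callable[[Row], float]
) -> List[Row]:
    """Illustrative stand-in for the add-or-replace rule (not the Snowpark API)."""
    result = []
    for row in rows:
        # Drop any existing column with the same name...
        new_row = {k: v for k, v in row.items() if k != col_name}
        # ...then add the value computed from the original row.
        new_row[col_name] = expr(row)
        result.append(new_row)
    # A new list is returned; the input rows are left unchanged.
    return result


rows = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]

# New name: the column is added alongside the existing ones.
added = with_column_model(rows, "MEAN", lambda r: (r["A"] + r["B"]) / 2)

# Existing name: the old "A" is replaced rather than duplicated.
replaced = with_column_model(rows, "A", lambda r: r["A"] * 10)
```

In this model, added carries three columns per row while replaced still carries two, which is the distinction the two paragraphs above describe.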

Example 1:

>>> df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
>>> df.with_column("mean", (df["a"] + df["b"]) / 2).show()
------------------------
|"A"  |"B"  |"MEAN"    |
------------------------
|1    |2    |1.500000  |
|3    |4    |3.500000  |
------------------------

Example 2:

>>> from typing import Iterable, Tuple
>>> from snowflake.snowpark.functions import udtf
>>> @udtf(output_schema=["number"])
... class sum_udtf:
...     def process(self, a: int, b: int) -> Iterable[Tuple[int]]:
...         yield (a + b, )
>>> df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
>>> df.with_column("total", sum_udtf(df.a, df.b)).sort(df.a).show()
-----------------------
|"A"  |"B"  |"TOTAL"  |
-----------------------
|1    |2    |3        |
|3    |4    |7        |
-----------------------
Parameters: