This documentation describes version 1.27.0.


DataFrame.withColumn(col_name: str, col: Union[Column, TableFunctionCall], *, keep_column_order: bool = False, ast_stmt: Expr = None) → DataFrame

Returns a DataFrame with an additional column named col_name, computed from the expression col.

If a column with the same name already exists in the DataFrame, that column is replaced by the new column.

Example 1:

>>> df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
>>> df.with_column("mean", (df["a"] + df["b"]) / 2).show()
------------------------
|"A"  |"B"  |"MEAN"    |
------------------------
|1    |2    |1.500000  |
|3    |4    |3.500000  |
------------------------

Example 2:

>>> from typing import Iterable, Tuple
>>> from snowflake.snowpark.functions import udtf
>>> @udtf(output_schema=["number"])
... class sum_udtf:
...     def process(self, a: int, b: int) -> Iterable[Tuple[int]]:
...         yield (a + b, )
>>> df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
>>> df.with_column("total", sum_udtf(df.a, df.b)).sort(df.a).show()
-----------------------
|"A"  |"B"  |"TOTAL"  |
-----------------------
|1    |2    |3        |
|3    |4    |7        |
-----------------------

Parameters:

  • col_name – The name of the column to add or replace.

  • col – The Column, or a table_function.TableFunctionCall with a single-column output, to add or replace.

  • keep_column_order – If True, the original order of the columns in the DataFrame is preserved when replacing a column.
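
The effect of keep_column_order on the resulting column order can be sketched in plain Python, without a Snowflake session. This is a model of the behavior described above, not Snowpark code: the helper name with_column_order is hypothetical, column names are compared as exact strings, and the sketch assumes that with the default keep_column_order=False a replaced column is dropped from its old position and the new column is appended at the end.

```python
from typing import List


def with_column_order(
    columns: List[str], col_name: str, keep_column_order: bool = False
) -> List[str]:
    """Hypothetical helper: models only the resulting column order."""
    if col_name not in columns:
        # A new name is appended at the end in either mode.
        return columns + [col_name]
    if keep_column_order:
        # The replacement takes the position of the column it replaces.
        return list(columns)
    # Assumed default: the old column is dropped and the replacement
    # is appended after the remaining columns.
    return [c for c in columns if c != col_name] + [col_name]


print(with_column_order(["A", "B"], "MEAN"))                       # ['A', 'B', 'MEAN']
print(with_column_order(["A", "B"], "A"))                          # ['B', 'A']
print(with_column_order(["A", "B"], "A", keep_column_order=True))  # ['A', 'B']
```

If downstream code selects columns by position, passing keep_column_order=True when overwriting an existing column is the safer choice under this assumption.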