snowflake.snowpark.RelationalGroupedDataFrame.apply_in_pandas
- RelationalGroupedDataFrame.apply_in_pandas(func: Callable, output_schema: StructType, **kwargs) -> DataFrame
Maps each grouped DataFrame into a pandas.DataFrame and applies the given function, which returns a pandas.DataFrame, to the data of each group. Internally, a vectorized UDTF with the input func argument as its end_partition is registered and called. Additional kwargs are accepted to specify arguments for registering the UDTF. The group-by clause must use column references, not general expressions.
Requires pandas to be installed in the execution environment and declared as a dependency, either by specifying the keyword argument packages=["pandas"] in this call or by calling add_packages() beforehand.
- Parameters:
func – A native Python function that accepts a single input argument, a pandas.DataFrame object, and returns a pandas.DataFrame. It is used as the end_partition input of a vectorized UDTF.
output_schema – A StructType instance that represents the table function’s output columns.
input_names – A list of strings that represents the table function’s input column names. Optional; if unspecified, the default column names are ARG1, ARG2, etc.
kwargs – Additional arguments to register the vectorized UDTF. See register() for all options.
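To make the func contract above concrete, here is a minimal local sketch that runs without Snowflake. The function name center, the sample columns TEAM and SCORE, and the use of pandas groupby to stand in for the per-group invocation are all assumptions for illustration; on Snowflake, each group would reach the function through the UDTF's end_partition instead.

```python
import pandas as pd


def center(pdf: pd.DataFrame) -> pd.DataFrame:
    """Receive all rows of one group and return a new pandas.DataFrame."""
    # Upper-case names mirror how Snowflake stores unquoted identifiers.
    # The group mean is only meaningful because the function sees the
    # whole group at once.
    return pdf.assign(DELTA=pdf.SCORE - pdf.SCORE.mean())


# Hypothetical sample data standing in for a grouped Snowpark DataFrame.
sample = pd.DataFrame(
    {"TEAM": ["A", "A", "B"], "SCORE": [1.0, 3.0, 10.0]}
)

# Local approximation only: call center once per group and stack the results.
result = pd.concat(
    [center(group) for _, group in sample.groupby("TEAM")],
    ignore_index=True,
)
print(result)
```

The three returned columns (TEAM, SCORE, DELTA) would correspond to an output_schema with three StructField entries, for example one string field followed by two float fields.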
- Examples:
Call apply_in_pandas using a temporary UDTF:
Call apply_in_pandas using a permanent UDTF, replacing the original UDTF:
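A minimal sketch of both calls follows, assuming an existing Snowpark Session is passed in as session. The helper names, the sample rows, the stage name mystage, and the UDTF name group_sum_in_pandas are illustrative assumptions; the keyword arguments is_permanent, stage_location, name, replace, and packages are forwarded to register().

```python
import pandas as pd


def convert(pdf: pd.DataFrame) -> pd.DataFrame:
    # Add a Fahrenheit column to every row of the group.
    return pdf.assign(TEMP_F=lambda x: x.TEMP_C * 9 / 5 + 32)


def group_sum(pdf: pd.DataFrame) -> pd.DataFrame:
    # Collapse the group to one row: its two keys plus the summed VALUE.
    return pd.DataFrame(
        [(pdf.GRADE.iloc[0], pdf.DIVISION.iloc[0], pdf.VALUE.sum())]
    )


def temporary_udtf_example(session):
    # Snowpark is imported lazily so the pandas helpers above stay usable
    # without a Snowflake connection.
    from snowflake.snowpark.types import (
        FloatType, StringType, StructField, StructType,
    )

    df = session.create_dataframe(
        [("SF", 21.0), ("SF", 17.5), ("NY", 30.9)],
        schema=["location", "temp_c"],
    )
    # No name or is_permanent given, so the UDTF is temporary.
    return df.group_by("location").apply_in_pandas(
        convert,
        output_schema=StructType([
            StructField("location", StringType()),
            StructField("temp_c", FloatType()),
            StructField("temp_f", FloatType()),
        ]),
        packages=["pandas"],
    )


def permanent_udtf_example(session):
    from snowflake.snowpark.types import (
        DoubleType, IntegerType, StringType, StructField, StructType,
    )

    # Assumed stage name; a permanent UDTF needs a stage for its code.
    session.sql("create or replace temp stage mystage").collect()
    df = session.create_dataframe(
        [("A", 2, 11.0), ("A", 2, 13.9), ("B", 5, 5.0)],
        schema=["grade", "division", "value"],
    )
    return df.group_by("grade", "division").apply_in_pandas(
        group_sum,
        output_schema=StructType([
            StructField("grade", StringType()),
            StructField("division", IntegerType()),
            StructField("sum", DoubleType()),
        ]),
        is_permanent=True,
        stage_location="@mystage",
        name="group_sum_in_pandas",
        replace=True,  # overwrite an existing UDTF with the same name
        packages=["pandas"],
    )
```

Each helper returns a Snowpark DataFrame, so a caller with a live session could append .show() or .collect() to inspect the result.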
See also