User-Defined Table Functions
This page covers user-defined table functions (UDTFs) in Snowpark. See Python UDTF for details. Snowpark also supports vectorized UDTFs. A regular UDTF processes its input row by row, which can be inefficient. A vectorized Python UDTF instead processes its input partition by partition: it receives each partition as a pandas DataFrame and returns its results as a pandas DataFrame, or as lists of pandas arrays or pandas Series.
In addition, vectorized Python UDTFs allow for easy integration with libraries that operate on pandas DataFrames or pandas arrays.
- A vectorized UDTF handler class:
  - defines an end_partition method that takes a pandas.DataFrame argument and returns a pandas.DataFrame, or a tuple of pandas.Series or pandas.arrays where each array is a column.
  - does NOT define a process method.
  - optionally defines an __init__ method, which is invoked before each partition is processed.
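The handler contract above can be sketched with pandas alone. This is a minimal illustration, not Snowpark sample code: the class name ScaleColumns, the column names id, value, and scaled, and the helper simulate_partitions are all hypothetical. The helper only imitates partition-by-partition execution locally with groupby; on Snowflake the partitions are built by the engine.

```python
import pandas as pd


class ScaleColumns:
    """Hypothetical vectorized UDTF handler: end_partition only, no process method."""

    def __init__(self):
        # Runs before each partition is processed, so state starts fresh.
        self.multiplier = 10

    def end_partition(self, df: pd.DataFrame) -> pd.DataFrame:
        # Receives the whole partition as one pandas DataFrame
        # and returns a pandas DataFrame with one column per output field.
        return pd.DataFrame({
            "id": df["id"],
            "scaled": df["value"] * self.multiplier,
        })


def simulate_partitions(handler_cls, data: pd.DataFrame, key: str) -> pd.DataFrame:
    # Local stand-in for server-side partitioning (an assumption for
    # illustration): one new handler instance per partition.
    outputs = []
    for _, partition in data.groupby(key, sort=True):
        handler = handler_cls()
        outputs.append(handler.end_partition(partition))
    return pd.concat(outputs, ignore_index=True)


rows = pd.DataFrame({"id": ["x", "x", "y"], "value": [3, 9, 5]})
result = simulate_partitions(ScaleColumns, rows, "id")
print(result)
```

Because the handler is an ordinary Python class, its logic can be unit-tested on small pandas DataFrames like this before it is registered.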
Note
A vectorized UDTF must be called with partition_by() to build the partitions.
Refer to UDTFRegistration for details and sample code on how to create regular and vectorized UDTFs using the Snowpark Python API.
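As a rough sketch of how registration and a partitioned call might fit together, the function below registers a vectorized handler and invokes it once per partition. It assumes an existing Snowpark session is passed in, that the handler is registered through session.udtf.register with PandasDataFrameType schemas, and that the partitioning is supplied through over(partition_by=...) on the table function call. The names Multiply, register_and_call, id, and col1 are hypothetical; check the UDTFRegistration reference for the exact signature before relying on it.

```python
import pandas as pd


class Multiply:
    """Hypothetical vectorized handler used only for this sketch."""

    def __init__(self):
        self.multiplier = 10

    def end_partition(self, df: pd.DataFrame) -> pd.DataFrame:
        # Scale one column of the partition and return the whole frame.
        out = df.copy()
        out["col1"] = out["col1"] * self.multiplier
        return out


def register_and_call(session):
    # `session` is assumed to be an existing snowflake.snowpark.Session.
    # Snowpark is imported lazily so the handler can be tested without it.
    from snowflake.snowpark.types import (
        IntegerType,
        PandasDataFrameType,
        StringType,
    )

    multiply_udtf = session.udtf.register(
        Multiply,
        output_schema=PandasDataFrameType(
            [StringType(), IntegerType()], ["id_", "col1_"]
        ),
        input_types=[PandasDataFrameType([StringType(), IntegerType()])],
        input_names=['"id"', '"col1"'],
    )
    df = session.create_dataframe(
        [["x", 3], ["x", 9], ["y", 5]], schema=["id", "col1"]
    )
    # The partition clause builds the partitions handed to end_partition.
    return df.select(multiply_udtf("id", "col1").over(partition_by=["id"]))
```

Keeping the Snowpark calls inside a function leaves the handler class importable and testable on its own.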
Classes
- Provides methods to register classes as UDTFs in the Snowflake database.
- Encapsulates a user defined table function that is returned by
Methods
- Registers a Python class as a Snowflake Python UDTF and returns the UDTF.
- Registers a Python class as a Snowflake Python UDTF from a Python or zip file, and returns the UDTF.
Attributes
- The Python class or a tuple containing the Python file path and the function name.
- The UDTF name.