This page documents an older version (1.12.1).

User-Defined Table Functions

This page covers user-defined table functions (UDTFs) in Snowpark. See Python UDTF for details.

Snowpark also supports vectorized UDTFs. A regular UDTF processes its input row by row, which can be inefficient. A vectorized Python UDTF instead processes its input partition by partition: it receives each partition as a pandas DataFrame and returns the results as a pandas DataFrame or as a list of pandas arrays or pandas Series.

In addition, vectorized Python UDTFs allow for easy integration with libraries that operate on pandas DataFrames or pandas arrays.

A vectorized UDTF handler class:
  • defines an end_partition method that takes a pandas DataFrame argument and returns a pandas.DataFrame, or a tuple of pandas.Series or pandas arrays where each array is a column.

  • does NOT define a process method.

  • optionally defines an __init__ method, which is invoked before each partition is processed.
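The handler shape listed above can be sketched as follows. This is a minimal illustration, not code from the Snowpark documentation: the class name `PartitionSummary`, the output column names, and the assumption that the first input column is numeric are all hypothetical. The class is exercised locally on a plain pandas DataFrame, so no Snowflake session is needed.

```python
import pandas as pd


class PartitionSummary:
    """Hypothetical vectorized UDTF handler: one summary row per partition."""

    def __init__(self):
        # Invoked before each partition is processed, so per-partition
        # state can be reset here.
        self.rows_seen = 0

    # No process method is defined: the whole partition arrives at once.
    def end_partition(self, df: pd.DataFrame) -> pd.DataFrame:
        # Assumption: the first input column holds the numeric values.
        values = df.iloc[:, 0]
        self.rows_seen = len(values)
        return pd.DataFrame(
            {"row_count": [self.rows_seen], "total": [float(values.sum())]}
        )


# Local check on a plain pandas DataFrame standing in for one partition.
partition = pd.DataFrame({"value": [1.0, 2.0, 3.5]})
result = PartitionSummary().end_partition(partition)
print(result)
```

Calling `end_partition` directly like this only checks the handler logic. Inside Snowflake, the method would be invoked once per partition after the class is registered as a UDTF.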

Note

A vectorized UDTF must be called with partition_by() to build the partitions.

Refer to UDTFRegistration for details and sample code on how to create regular and vectorized UDTFs using the Snowpark Python API.
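As a rough sketch of registering a vectorized UDTF and calling it over partitions, the example below assumes an existing Snowpark `Session` object is passed in. The handler `ScaleColumns`, the helper `register_and_call`, and all column names are hypothetical. The Snowpark imports sit inside the helper so that the handler itself can be checked locally with pandas alone.

```python
import pandas as pd


class ScaleColumns:
    """Hypothetical vectorized handler: scales the numeric column by a fixed factor."""

    def __init__(self):
        self.multiplier = 10

    def end_partition(self, df: pd.DataFrame) -> pd.DataFrame:
        # Name the input columns positionally; these names are illustrative.
        df.columns = ["id", "amount"]
        df["amount"] = df["amount"] * self.multiplier
        return df


def register_and_call(session):
    """Register ScaleColumns and call it once per "id" partition.

    `session` is assumed to be an existing snowflake.snowpark.Session.
    """
    # Imported here so the handler above can be tested without Snowpark installed.
    from snowflake.snowpark.types import IntegerType, PandasDataFrameType, StringType

    scale_udtf = session.udtf.register(
        ScaleColumns,
        output_schema=PandasDataFrameType(
            [StringType(), IntegerType()], ["id_", "amount_"]
        ),
        input_types=[PandasDataFrameType([StringType(), IntegerType()])],
    )
    df = session.create_dataframe(
        [["x", 3], ["x", 9], ["y", 5]], schema=["id", "amount"]
    )
    # over(partition_by=...) builds the partitions that end_partition receives.
    return df.select(scale_udtf("id", "amount").over(partition_by=["id"]))


# Local check of the handler logic only; no Snowflake session is involved.
local = ScaleColumns().end_partition(pd.DataFrame([["x", 3], ["x", 9]]))
```

With a live session, `register_and_call(session).show()` would be one way to inspect the result. The input and output types here are illustrative and would need to match the real data.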

Classes

UDTFRegistration(session)

Provides methods to register classes as UDTFs in the Snowflake database.

UserDefinedTableFunction(handler, ...)

Encapsulates a user-defined table function that is returned by udtf(), UDTFRegistration.register() or UDTFRegistration.register_from_file().

Methods

register(handler, output_schema[, ...])

Registers a Python class as a Snowflake Python UDTF and returns the UDTF.

register_from_file(file_path, handler_name, ...)

Registers a Python class as a Snowflake Python UDTF from a Python or zip file, and returns the UDTF.
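A minimal sketch of file-based registration is shown below, assuming an existing Snowpark `Session` is supplied by the caller. The handler source, the file name `generator_udtf.py`, and the helper `register_generator` are hypothetical and exist only for illustration. The handler is a regular (row-by-row) UDTF with a `process` method.

```python
import tempfile
from pathlib import Path

# Hypothetical handler source: a regular UDTF that yields 0..n-1, one row each.
HANDLER_SOURCE = '''
class GeneratorUDTF:
    def process(self, n):
        for i in range(n):
            yield (i,)
'''

# Write the handler to a temporary Python file so there is a path to register.
handler_path = Path(tempfile.mkdtemp()) / "generator_udtf.py"
handler_path.write_text(HANDLER_SOURCE)


def register_generator(session, file_path=handler_path):
    """Register GeneratorUDTF from a file and build a call to it.

    `session` is assumed to be an existing snowflake.snowpark.Session.
    """
    # Imported here so the file-writing part runs without Snowpark installed.
    from snowflake.snowpark.functions import lit
    from snowflake.snowpark.types import IntegerType, StructField, StructType

    generator_udtf = session.udtf.register_from_file(
        file_path=str(file_path),
        handler_name="GeneratorUDTF",
        output_schema=StructType([StructField("number", IntegerType())]),
        input_types=[IntegerType()],
    )
    # Returns a DataFrame with one row per value yielded by process().
    return session.table_function(generator_udtf(lit(3)))
```

Passing `input_types` and `output_schema` explicitly keeps the sketch independent of type-hint inference from the file.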

Attributes

handler

The Python class, or a tuple containing the Python file path and the handler class name.

name

The UDTF name.