java.lang.Object
- com.snowflake.snowpark_java.UDTFRegistration

public class UDTFRegistration
extends Object

Provides methods to register a UDTF (user-defined table function) in the Snowflake database. Session.udtf() returns an object of this class.

To register an UDTF, you must:

Define a UDTF class.
Create an instance of that class, and register that instance as a UDTF.

The next sections describe these steps in more detail.

Defining the UDTF Class

Define a class that implements one of the JavaUDTF[N] interfaces (e.g. JavaUDTF0, JavaUDTF1, etc.), where "n" specifies the number of input arguments for your UDTF. For example, if your UDTF passes in 3 input arguments, implements the JavaUDTF3 interface.

In your class, implement the following four methods:

process() , which is called once for each row in the input partition.
endPartition(), which is called once for each partition after all rows have been passed to process().
outputSchema(), which returns a StructType object that describes the schema for the returned rows.
inputSchema(), which returns a StructType object that describes the types of the input parameters. If your process() method passes in Map arguments, you must implement the inputSchema() method. If the method does not pass in Map arguments, implementing inputSchema() is optional.

When a UDTF is called, the rows are grouped into partitions before they are passed to the UDTF:

If the statement that calls the UDTF specifies the PARTITION clause (explicit partitions), that clause determines how the rows are partitioned.
If the statement does not specify the PARTITION clause (implicit partitions), Snowflake determines how best to partition the rows.

For an explanation of partitions, see Table Functions and Partitions

Defining the process() Method

This method is invoked once for each row in the input partition.

The arguments passed to the registered UDTF are passed to process(). For each argument passed to the UDTF, you must have a corresponding argument in the signature of the process() method. Make sure that the type of the argument in the process() method matches the Snowflake data type of the corresponding argument in the UDTF.

Snowflake supports the following data types for the parameters for a UDTF:

Supported Snowflake Data Types and Corresponding Java Types
SQL Type	Java Type	Notes
NUMBER	java.lang.Short	Supported
NUMBER	java.lang.Integer	Supported
NUMBER	java.lang.Long	Supported
FLOAT	java.lang.Float	Supported
DOUBLE	java.lang.Double	Supported
NUMBER	java.math.BigDecimal	Supported
VARCHAR	String	Supported
BOOL	java.lang.Boolean	Supported
DATE	java.sql.Date	Supported
TIMESTAMP	java.sql.Timestamp	Supported
BINARY	byte[]	Supported
ARRAY	String[] or Variant[]	Supported String[] or Variant[]
OBJECT	Map<String, String> or Map<String, Variant>	Supported Map<String, String> or Map<String, Variant>
VARIANT	`Variant`	Supported

Defining the endPartition() Method

This method is invoked once for each partition, after all rows in that partition have been passed to the process() method.

You can use this method to generate output rows, based on any state information that you aggregate in the process() method.

Defining the outputSchema() Method

In this method, define the output schema for the rows returned by the process() and endPartition() methods.

Construct and return a StructType object that uses an Array of StructField objects to specify the Snowflake data type of each field in a returned row.

Snowflake supports the following DataTypes for the input types:

Supported Snowflake Data Types and Corresponding Java Types
Snowpark `DataTypes`	Java Type	Notes
ShortType	java.lang.Short	Supported
IntegerType	java.lang.Integer	Supported
LongType	java.lang.Long	Supported
FloatType	java.lang.Float	Supported
DoubleType	java.lang.Double	Supported
DecimalType	java.math.BigDecimal	Supported
StringType	String	Supported
BooleanType	java.lang.Boolean	Supported
DateType	java.sql.Date	Supported
TimestampType	java.sql.Timestamp	Supported
BinaryType	byte[]	Supported
ArrayType(StringType) or ArrayType(Variant)	String[] or Variant[]	Supported String[] or Variant[]
MapType(StringTyoe, StringType) or MapType(StringTyoe, VariantType)	Map<String, String> or Map<String, Variant>	Supported Map<String, String> or Map<String, Variant>
VariantType	`Variant`	Supported

Defining the inputSchema() Method

In this method, define the input schema for the input arguments for process(). Snowpark can infer the input schema if Map is not used as process().

Construct and return a StructType object that uses an Array of StructField objects to specify the data type of each parameter for process().

Snowflake supports the following DataTypes for input types

Supported Java Types and Corresponding Snowflake Data Types
Java Type	`DataTypes`	Notes
java.lang.Boolean	BooleanType	Supported
java.lang.Short	ShortType	Supported
java.lang.Integer	IntegerType	Supported
java.lang.Long	LongType	Supported
java.lang.Float	FloatType	Supported
java.lang.Double	DoubleType	Supported
java.math.BigDecimal	DecimalType	Supported
String	StringType	Supported
java.sql.Date	DateType	Supported
java.sql.Timestamp	TimestampType	Supported
byte[]	BinaryType	Supported
String[]	ArrayType(StringType)	Supported
Variant[]	ArrayType(Variant)	Supported
Map<String, String>	MapType(StringType, StringType)	Supported
Map<String, Variant>	MapType(StringType, VariantType)	Supported
com.snowflake.snowpark_java.types.Variant	VariantType	Supported

Example of a UDTF Class

The following is an example of a UDTF class that generates a range of rows.

The UDTF passes in 2 arguments, so the class implements JavaUDTF2.

The arguments start and count specify the starting number for the row and the number of rows to generate.


 import java.util.stream.Stream;
 import com.snowflake.snowpark_java.types.*;
 import com.snowflake.snowpark_java.udtf.*;

 class MyRangeUdtf implements JavaUDTF2<Integer, Integer> {
   public StructType outputSchema() {
     return StructType.create(new StructField("C1", DataTypes.IntegerType));
   }

   // Because the process() method in this example does not pass in Map arguments,
   // implementing the inputSchema() method is optional.
   public StructType inputSchema() {
     return StructType.create(
             new StructField("start_value", DataTypes.IntegerType),
             new StructField("value_count", DataTypes.IntegerType));
   }

   public Stream<Row> endPartition() {
     return Stream.empty();
   }

   public Stream<Row> process(Integer start, Integer count) {
     Stream.Builder<Row> builder = Stream.builder();
     for (int i = start; i < start + count ; i++) {
       builder.add(Row.create(i));
     }
     return builder.build();
   }
 }

Registering and Calling the UDTF

Next, create an instance of the new class, and register the class by calling one of the UDTFRegistration methods. You can register a temporary or permanent UDTF by name. If you don't need to call the UDTF by name, you can register an anonymous UDTF.

Registering a Temporary UDTF By Name

To register a temporary UDTF by name, call registerTemporary, passing in a name for the UDTF and an instance of the UDTF class. For example:


 // Use the MyRangeUdtf defined in previous example.
 TableFunction tableFunction = session.udtf().registerTemporary("myUdtf", new MyRangeUdtf());
 session.tableFunction(tableFunction, Functions.lit(10), Functions.lit(5)).show();

Registering a Permanent UDTF By Name

If you need to use the UDTF in subsequent sessions, register a permanent UDTF by calling


 registerPermanent()

.

When registering a permanent UDTF, you must specify a stage where the registration method will upload the JAR files for the UDTF and its dependencies. For example:


 // Use the MyRangeUdtf defined in previous example.
 TableFunction tableFunction = session.udtf().registerPermanent("myUdtf", new MyRangeUdtf(), "@myStage");
 session.tableFunction(tableFunction, Functions.lit(10), Functions.lit(5)).show();

Registering an Anonymous Temporary UDTF

If you do not need to refer to a UDTF by name, use registerTemporary(JavaUDTF udtf) to create an anonymous UDTF instead. For example:


 // Use the MyRangeUdtf defined in previous example.
 TableFunction tableFunction = session.udtf().registerTemporary(new MyRangeUdtf());
 session.tableFunction(tableFunction, Functions.lit(10), Functions.lit(5)).show();

Since:: 1.4.0

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method	Description
`TableFunction`	`registerPermanent(String funcName, JavaUDTF udtf, String stageLocation)`	Registers an UDTF instance as a Snowflake UDTF.
`TableFunction`	`registerTemporary(JavaUDTF udtf)`	Registers an UDTF instance as a temporary anonymous UDTF that is scoped to this session.
`TableFunction`	`registerTemporary(String funcName, JavaUDTF udtf)`	Registers an UDTF instance as a temporary Snowflake UDTF that you plan to use in the session.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Method Detail
  - registerTemporary
```
public TableFunction registerTemporary(JavaUDTF udtf)
```
    Registers an UDTF instance as a temporary anonymous UDTF that is scoped to this session.
    
    Parameters:
    
    udtf - The UDTF instance to be registered
    
    Returns:
    
    A TableFunction that represents the corresponding FUNCTION created in Snowflake
    
    Since:
    
    1.4.0
  - registerTemporary
```
public TableFunction registerTemporary(String funcName,
                                       JavaUDTF udtf)
```
    Registers an UDTF instance as a temporary Snowflake UDTF that you plan to use in the session.
    
    Parameters:
    
    funcName - The name that you want to use to refer to the UDTF.
    
    udtf - The UDTF instance to be registered
    
    Returns:
    
    A TableFunction that represents the corresponding FUNCTION created in Snowflake
    
    Since:
    
    1.4.0
  - registerPermanent
```
public TableFunction registerPermanent(String funcName,
                                       JavaUDTF udtf,
                                       String stageLocation)
```
    Registers an UDTF instance as a Snowflake UDTF.
    The function uploads the JAR files that the UDTF depends upon to the specified stage. Each JAR file is uploaded to a subdirectory named after the MD5 checksum for the file.
    If you register multiple UDTFs and specify the same stage location, any dependent JAR files used by those functions will only be uploaded once. The JAR file for the UDTF code itself will be uploaded to a subdirectory named after the UDTF.
    
    Parameters:
    
    funcName - The name that you want to use to refer to the UDTF.
    
    udtf - The UDTF instance to be registered.
    
    stageLocation - Stage location where the JAR files for the UDTF and its dependencies should be uploaded
    
    Returns:
    
    A TableFunction that represents the corresponding FUNCTION created in Snowflake
    
    Since:
    
    1.4.0

Class UDTFRegistration