Creating Java UDFs

This topic shows how to create and install a Java user-defined function (UDF).

When you write a Java UDF, you write Java code for Snowflake to execute as UDF logic. This Java code is the UDF’s handler. You deploy the UDF with CREATE FUNCTION, giving the UDF a name and specifying the Java method as the handler to use when the UDF is called.

In this Topic:

Writing the UDF Handler in Java

Use the following requirements and guidelines when writing your Java UDF handler.

  • Define the class as public.

  • Inside the class, declare at least one public method to use as a UDF handler.

    For an inline UDF, declare one handler method only. If instead you intend to package the class into a JAR as a pre-compiled UDF, you can declare multiple handler methods, later specifying each as a handler with the HANDLER clause of a CREATE FUNCTION statement.

    You can declare other methods, if needed, to be called by the handler method.

    Use the following requirements and guidelines for each handler method:

    • Declare the handler method as public, either static or non-static.

      If the method is non-static, your class must also declare a zero-argument constructor or no constructor at all.

      Snowflake does not pass any arguments to the constructor when it instantiates the class. If the constructor throws an error, the error is thrown as a user error, along with the exception message.

    • Specify an appropriate return type.

      The return type must be one of the data types specified in the Java Data Type column of the SQL-Java Type Mappings table. The return type must be compatible with the SQL data type specified in the RETURNS clause of the CREATE FUNCTION statement.

    • Ensure that each handler method argument (if any) is a data type specified in the Java Data Type column of the SQL-Java Type Mappings table.

      When choosing data types of Java variables, take into account the maximum and minimum possible values of the data that could be sent from (and returned to) Snowflake.

    • Comply with the Snowflake-imposed constraints for Java UDFs in each handler method and methods it calls.

Creating the Function in Snowflake

The information in this section applies to all Java UDFs, regardless of whether the code is specified in-line or pre-compiled.

You must execute a CREATE FUNCTION statement to specify aspects of the UDF, including:

  • The UDF name.

  • The name of the Java method for Snowflake to call as a handler when the UDF is called.

The UDF name does not need to match the name of the handler method written in Java. The CREATE FUNCTION statement associates the UDF name with the Java method, as shown in the following diagram:

Using the CREATE FUNCTION Statement to Associate the Handler Method With the UDF Name

When choosing a name for the UDF:

  • Follow the rules for Object Identifiers.

  • Choose a name that is unique, or follow the rules for Overloading of UDF Names.

    Important

    Unlike overloading for SQL UDFs, which distinguishes among functions based on both the number and the data types of the arguments, Java UDFs distinguish among methods based only on the number of arguments. If two Java methods have the same name and the same number of arguments, but different data types, then calling a UDF using one of those methods as a handler generates an error similar to the following:

    Cannot determine which implementation of handler “handler name” to invoke since there are multiple definitions with <number of args> arguments in function <user defined function name> with handler <class name>.<handler name>

    If a warehouse is available, the error is detected at the time that the UDF is created. Otherwise, the error occurs when the UDF is called.

    Resolving based on data types is impractical because some SQL data types can be mapped to more than one Java data type and thus potentially to more than one Java UDF signature.

Handler method arguments are bound to UDF arguments by position, not name. In other words, the first UDF argument is passed to the first method argument, the second UDF argument is passed to the second method argument, and so on.

In the examples below, the SQL UDF argument x corresponds to the Java method argument a, and y corresponds to b:

create function add_int_float(x numeric(9, 0), y float)
returns float
language java
handler = `MyClass.addIntFloat`;
public static float addIntFloat(int a, float b) {
  // ...
}

For information about the data types of arguments, see SQL-Java Data Type Mappings for Parameters and Return Types.

Note

Scalar functions (UDFs) have a limit of 500 input arguments.

It is recommended that you use the RUNTIME_VERSION parameter of the the CREATE FUNCTION statement to specify which supported Java runtime version to use because the default version may change in the future.

In-line UDFs vs. Pre-compiled UDFs

You can define a Java UDF handler in either of the following ways:

  • Pre-compiled, in which you package the compiled Java handler in a JAR and put it where Snowflake can read it.

    The CREATE FUNCTION statement specifies the location of the JAR file containing the handler. Before executing CREATE FUNCTION, you copy the JAR file to a stage from which Snowflake can read it.

  • In-line, in which you include the Java code with the UDF declaration itself.

    The CREATE FUNCTION statement itself specifies the Java source code. Snowflake compiles the source code and stores the compiled code in a JAR file. You can specify a location for the resulting JAR file with the TARGET_PATH clause when you run CREATE FUNCTION.

    • If CREATE FUNCTION specifies a location for the JAR file, then Snowflake compiles the code once and keeps the JAR file for future use.

    • If CREATE FUNCTION does not specify a location for the JAR file, then Snowflake re-compiles the code for each SQL statement that calls the UDF, and Snowflake automatically cleans up the JAR file after the SQL statement finishes.

Practical Differences

In-line Java UDF Advantages

  • They are usually easier to implement. After using your development tools to verify that your code works as it should, you can deploy it by copying it into the CREATE FUNCTION statement, then executing the statement. You can maintain the code there, updating and executing CREATE FUNCTION, without having to separately re-package the compiled output in a JAR and update it on a stage.

Pre-compiled Java UDF Advantages

  • You can use them when you have a JAR file but no source code.

  • You can use them if the source code is too large to paste into a CREATE FUNCTION statement. (In-line Java UDFs have an upper limit on the source code size.)

  • A pre-compiled Java UDF can contain multiple handler functions. Multiple CREATE FUNCTION statements can reference the same JAR file but specify different handler functions within the JAR file.

    In-line Java UDFs normally contain only one callable function. (That callable function can call other functions, and those other functions can be defined in the same class or can be defined in other classes defined in library JAR files.)

  • If you have tools or an environment for testing or debugging JAR files, it might be more convenient to do most of the development work on your UDF using JAR files. This is particularly true if the code is large or complex.

Creating an In-line Java UDF

For an in-line UDF, you supply the Java source code as part of the CREATE FUNCTION statement.

You put the Java source code in the AS clause, surrounding the code with single quotes or a pair of dollar signs ($$). Using the double dollar signs might be easier, such as when the source code contains embedded single quotes.

Code in the following example declares an add UDF whose handler is the add method in the TestAddFunc class.

create function add(x integer, y integer)
returns integer
language java
handler='TestAddFunc.add'
target_path='@~/TestAddFunc.jar'
as
$$
    class TestAddFunc {
        public static int add(int x, int y) {
          return x + y;
        }
    }
$$;

Given that the Java source code can contain more than one class and more than one method in a class, the HANDLER clause specifies the class and method to use as the handler.

An in-line Java UDF (like a pre-compiled Java UDF) can call code in JAR files that are included in the CREATE FUNCTION statement’s IMPORTS clause.

For more details about the syntax of the CREATE FUNCTION statement, see CREATE FUNCTION.

For more examples, see in-line Java UDF examples.

Creating a Pre-compiled Java UDF

When you intend to create a UDF that specifies the location of an existing JAR for its handler, you develop the handler by:

Organizing Your Files

If you plan to compile the Java code to create the JAR file yourself, you can organize the files as shown below. This example assumes that you plan to use Java’s package mechanism.

  • developmentDirectory

    • packageDirectory

      • class_file1.java

      • class_file2.java

    • classDirectory

      • class_file1.class

      • class_file2.class

    • manifest_file.manifest (optional)

    • jar_file.jar

    • put_command.sql

developmentDirectory

This directory contains the project-specific files required to create your Java UDF.

packageDirectory

This directory contains the .java files to compile and include in the package.

class_file#.java

These files contain the Java source code of the UDF.

class_file#.class

These are the .class file(s) created by compiling the .java files.

manifest_file.manifest

The optional manifest file used when combining the .class files (and optionally, dependency JAR files) into the JAR file.

jar_file.jar

The JAR file that contains the UDF code.

put_command.sql

This file contains the SQL PUT command to copy the JAR file to a Snowflake stage.

Compiling the Java Code and Creating the JAR File

To create a JAR file that contains the compiled Java code:

  • Use javac to compile your .java file to a .class file.

    If you use a compiler newer than version 11.x, you can use the “–release” option to specify that the target version is version 11.

  • Put your .class file into a JAR file. You can package multiple class files (and other JAR files) into your JAR file.

    For example:

    jar cf ./my_udf.jar MyClass.class
    

    A manifest file is required if your handler class is in a package, and optional otherwise. The following example uses a manifest file:

    jar cmf my_udf.manifest ./my_udf.jar example/MyClass.class
    

    To build the jar file with all dependencies included, you can use Maven’s mvn package command with the maven-assembly-plugin. For more information about the maven-assembly-plugin, see the Maven usage page.

    Snowflake automatically supplies the standard Java libraries (e.g. java.util). If your code calls those libraries, you do not need to include them in your JAR file.

    The methods that you call in libraries must follow the same Snowflake-imposed constraints as your Java method.

Copying the JAR File to Your Stage

In order for Snowflake to read from the JAR containing your handler method, you need to copy the JAR to one of the following kinds of stage:

  • A user or named internal stage.

    Snowflake does not currently support using a table stage to store a JAR file with UDF handlers. For more on internal stages, see Choosing an Internal Stage for Local Files.

  • An external stage.

The stage hosting the JAR file must be readable by the owner of the UDF.

Typically, you upload the JAR to a named internal stage using the PUT command. Note that you can’t execute the PUT command through the Snowflake GUI; you can use SnowSQL to execute PUT. See the Examples section for an example PUT command to copy a .jar file to a stage.

For more about creating stages, see CREATE STAGE.

Caveats and Best Practices

If you delete or rename the JAR file, you can no longer call the UDF.

If you need to update your JAR file, then:

  • Update it while no calls to the UDF can be made.

  • If the old .jar file is still in the stage, the PUT command should include the clause OVERWRITE=TRUE.

Granting Privileges on the Function

For any role other than the owner of the function to call the function, the owner must grant the appropriate privileges to the role.

The GRANT statements for a Java UDF are essentially identical to the GRANT statements for other UDFs, such as JavaScript UDFs.

For example:

GRANT USAGE ON FUNCTION my_java_udf(number, number) TO my_role;

Examples

In-Line Java UDFs

Creating and Calling a Simple In-line Java UDF

The following statements create and call an in-line Java UDF. This code returns the VARCHAR passed to it.

This function is declared with the optional CALLED ON NULL INPUT clause to indicate that the function is called even if the value of the input is NULL. (This function would return NULL with or without this clause, but you could modify the code to handle NULL another way, for example, to return an empty string.)

Create the UDF:

create or replace function echo_varchar(x varchar)
returns varchar
language java
called on null input
handler='TestFunc.echoVarchar'
target_path='@~/testfunc.jar'
as
'class TestFunc {
  public static String echoVarchar(String x) {
    return x;
  }
}';

Call the UDF:

SELECT echo_varchar('Hello');
+-----------------------+
| ECHO_VARCHAR('HELLO') |
|-----------------------|
| Hello                 |
+-----------------------+

Passing a NULL to an In-line Java UDF

This uses the echo_varchar() UDF defined above. The SQL NULL value is implicitly converted to Java null, and that Java null is returned and implicitly converted back to SQL NULL:

Call the UDF:

SELECT echo_varchar(NULL);
+--------------------+
| ECHO_VARCHAR(NULL) |
|--------------------|
| NULL               |
+--------------------+

Returning NULL Explicitly

The following code shows how to return a NULL value explicitly. The Java value null is converted to SQL NULL.

Create the UDF:

create or replace function return_a_null()
returns varchar
null
language java
handler='TemporaryTestLibrary.returnNull'
target_path='@~/TemporaryTestLibrary.jar'
as
$$
class TemporaryTestLibrary {
  public static String returnNull() {
    return null;
  }
}
$$;

Call the UDF:

SELECT return_a_null();
+-----------------+
| RETURN_A_NULL() |
|-----------------|
| NULL            |
+-----------------+

Passing an OBJECT to an In-line Java UDF

The following example uses the SQL OBJECT data type and the corresponding Java data type (Map<String, String>), and extracts a value from the OBJECT. This example also shows that you can pass multiple parameters to a Java UDF.

Create and load a table that contains a column of type OBJECT:

CREATE TABLE objectives (o OBJECT);
INSERT INTO objectives SELECT PARSE_JSON('{"outer_key" : {"inner_key" : "inner_value"} }');

Create the UDF:

create or replace function extract_from_object(x OBJECT, key VARCHAR)
returns variant
language java
handler='VariantLibrary.extract'
target_path='@~/VariantLibrary.jar'
as
$$
import java.util.Map;
class VariantLibrary {
  public static String extract(Map<String, String> m, String key) {
    return m.get(key);
  }
}
$$;

Call the UDF:

SELECT extract_from_object(o, 'outer_key'), 
       extract_from_object(o, 'outer_key')['inner_key'] FROM objectives;
+-------------------------------------+--------------------------------------------------+
| EXTRACT_FROM_OBJECT(O, 'OUTER_KEY') | EXTRACT_FROM_OBJECT(O, 'OUTER_KEY')['INNER_KEY'] |
|-------------------------------------+--------------------------------------------------|
| {                                   | "inner_value"                                    |
|   "inner_key": "inner_value"        |                                                  |
| }                                   |                                                  |
+-------------------------------------+--------------------------------------------------+

Passing an ARRAY to an In-line Java UDF

The following example uses the SQL ARRAY data type.

Create the UDF:

create or replace function multiple_functions_in_jar(array1 array)
returns varchar
language java
handler='TemporaryTestLibrary.handleStrings'
target_path='@~/TemporaryTestLibrary.jar'
as
$$
class TemporaryTestLibrary {
  public static String handleStrings(String[] strings) {
    return concatenate(strings);
  }
  public static String concatenate(String[] strings) {
    int numberOfStrings = strings.length;
    String concatenated = "";
    for (int i = 0; i < numberOfStrings; i++)  {
        concatenated = concatenated + " " + strings[i];
    }
    return concatenated;
  }
}
$$;

Call the UDF:

SELECT multiple_functions_in_jar(ARRAY_CONSTRUCT('Hello', 'world'));
+--------------------------------------------------------------+
| MULTIPLE_FUNCTIONS_IN_JAR(ARRAY_CONSTRUCT('HELLO', 'WORLD')) |
|--------------------------------------------------------------|
|  Hello world                                                 |
+--------------------------------------------------------------+

Passing a GEOGRAPHY Value to an In-line Java UDF

The following example uses the SQL GEOGRAPHY data type.

Create the UDF:

create or replace function geographyEquals(x GEOGRAPHY, y GEOGRAPHY)
    returns boolean
    language java
    packages=('com.snowflake:snowpark:1.2.0')
    handler='TestGeography.compute'
    as
    $$
        import com.snowflake.snowpark_java.types.Geography;

        class TestGeography {

           public static boolean compute(Geography geo1, Geography geo2) {
             return geo1.equals(geo2);
           }

        }
    $$;

You can use the PACKAGES clause to specify a Snowflake system package such as the Snowpark package. When you do, you don’t need to also include the Snowpark JAR file as a value of an IMPORTS clause. For more on PACKAGES, see CREATE FUNCTION optional parameters.

Create data and call the UDF with that data:

create table geocache_table (id INTEGER, g1 GEOGRAPHY, g2 GEOGRAPHY);

insert into geocache_table (id, g1, g2) select
    1, TO_GEOGRAPHY('POINT(-122.35 37.55)'), TO_GEOGRAPHY('POINT(-122.35 37.55)');
insert into geocache_table (id, g1, g2) select
    2, TO_GEOGRAPHY('POINT(-122.35 37.55)'), TO_GEOGRAPHY('POINT(90.0 45.0)');

select id, g1, g2, geographyEquals(g1, g2) as "EQUAL?"
   from geocache_table
   order by id;

The output looks similar to:

+----+--------------------------------------------------------+---------------------------------------------------------+--------+
| ID | G1                                                     | G2                                                      | EQUAL? |
+----+--------------------------------------------------------|---------------------------------------------------------+--------+
| 1  | { "coordinates": [ -122.35, 37.55 ], "type": "Point" } | { "coordinates": [ -122.35,  37.55 ], "type": "Point" } | TRUE   |
| 2  | { "coordinates": [ -122.35, 37.55 ], "type": "Point" } | { "coordinates": [   90.0,   45.0  ], "type": "Point" } | FALSE  |
+----+--------------------------------------------------------+---------------------------------------------------------+--------+

Passing a VARIANT Value to an In-line Java UDF

When you pass a value of the SQL VARIANT type to a Java UDF, Snowflake can convert the value to the Variant type provided with the Snowpark package. Note that Variant is supported from the Snowpark package version 1.4.0 and later.

The Snowpark Variant type provides methods for converting values between Variant and other types.

To use the Snowpark Variant type, use the PACKAGES clause to specify the Snowpark package when creating the UDF. When you do, you don’t need to also include the Snowpark JAR file as a value of an IMPORTS clause. For more information on PACKAGES, see CREATE FUNCTION optional parameters.

Code in the following example receives JSON data stored as the VARIANT type, then uses the Variant type in the Snowpark library to retrieve the price value from the JSON. The received JSON has a structure similar to the JSON displayed in Sample Data Used in Examples.

create or replace function retrieve_price(v VARIANT)
returns INTEGER
language java
packages=('com.snowflake:snowpark:1.4.0')
handler='VariantTest.retrievePrice'
as
$$
import java.util.Map;
import com.snowflake.snowpark_java.types.Variant;

public class VariantTest {

  static Integer total = 0;

  public static Integer retrievePrice(Variant v) throws Exception {
    Map<String, Variant> saleMap = v.asMap();
    int price = saleMap.get("vehicle").asMap().get("price").asInt();
    return price;
  }
}
$$;

Pre-compiled Java UDFs

Creating and Calling a Simple Pre-compiled Java UDF

The following statements create a simple Java UDF. This sample generally follows the file and directory structure described in Organizing Your Files.

Create a java file that contains your source code:

package mypackage;

public class my_decrement_udf_class
{

  public static int my_decrement_udf_method(int i)
  {
      return i - 1;
  }


  public static void main(String[] argv)
  {
      System.out.println("This main() function won't be called.");
  }

}

Optionally, create a manifest file similar to the one shown below:

Manifest-Version: 1.0
Main-Class: my_decrement_udf_class.class

Compile the source code. This example stores the generated .class file(s) in the directory named classDirectory.

javac -d classDirectory my_decrement_udf_package/my_decrement_udf_class.java

Create the JAR file from the .class file. This example uses “-C classDirectory” to specify the location of the .class files:

jar cmf my_decrement_udf_manifest.manifest ./my_decrement_udf_jar.jar -C classDirectory my_decrement_udf_package/my_decrement_udf_class.class

Use the PUT command to copy the JAR file from the local file system to a stage. This example uses the user’s default stage, named @~:

put
    file:///Users/Me/JavaUDFExperiments/my_decrement_udf_jar.jar
    @~/my_decrement_udf_package_dir/
    auto_compress = false
    overwrite = true
    ;

You can store the PUT command in a script file and then execute that file through snowsql. The snowsql command looks similar to the following:

snowsql -a <account_identifier> -w <warehouse> -d <database> -s <schema> -u <user> -f put_command.sql

This example assumes that the user’s password is specified in the SNOWSQL_PWD environment variable.

Create the UDF:

create function my_decrement_udf(i numeric(9, 0))
    returns numeric
    language java
    imports = ('@~/my_decrement_udf_package_dir/my_decrement_udf_jar.jar')
    handler = 'my_decrement_udf_package.my_decrement_udf_class.my_decrement_udf_method'
    ;

Call the UDF:

SELECT my_decrement_udf(-15);
+-----------------------+
| MY_DECREMENT_UDF(-15) |
|-----------------------|
|                   -16 |
+-----------------------+
Back to top