Java UDF handler examples

This topic includes simple examples of UDF handler code written in Java.

For more on using Java to create a UDF handler, see Creating a Java UDF handler.

Creating and calling a simple in-line Java UDF

The following statements create and call an in-line Java UDF. This code returns the VARCHAR passed to it.

This function is declared with the optional CALLED ON NULL INPUT clause to indicate that the function is called even if the value of the input is NULL. (This function would return NULL with or without this clause, but you could modify the code to handle NULL another way, for example, to return an empty string; a sketch of that variation follows the NULL example below.)

Create the UDF:

create or replace function echo_varchar(x varchar)
returns varchar
language java
called on null input
handler='TestFunc.echoVarchar'
target_path='@~/testfunc.jar'
as
'class TestFunc {
  public static String echoVarchar(String x) {
    return x;
  }
}';

Call the UDF:

SELECT echo_varchar('Hello');
+-----------------------+
| ECHO_VARCHAR('HELLO') |
|-----------------------|
| Hello                 |
+-----------------------+

Passing a NULL to an in-line Java UDF

This example uses the echo_varchar() UDF defined above. The SQL NULL value is implicitly converted to Java null; that Java null is returned and then implicitly converted back to SQL NULL:

Call the UDF:

SELECT echo_varchar(NULL);
+--------------------+
| ECHO_VARCHAR(NULL) |
|--------------------|
| NULL               |
+--------------------+
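
As noted above, you could modify the handler to return something other than NULL when the input is NULL. A variant that returns an empty string might look like this (a sketch, not part of the UDF defined above):

class TestFunc {
  public static String echoVarchar(String x) {
    // With CALLED ON NULL INPUT, x can be Java null; substitute an empty string.
    return (x == null) ? "" : x;
  }
}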

Passing array values

Java methods can receive SQL arrays in either of two ways:

  • Using Java’s array feature.

  • Using Java’s varargs (variable number of arguments) feature.

In both cases, your SQL code must pass an ARRAY.

Note

Be sure to use Java types with valid mappings to SQL types. For more information, refer to SQL-Java Data Type Mappings.
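
For example, a handler that receives SQL VARCHAR, NUMBER (scale 0), and BOOLEAN arguments might declare its Java parameters like this (a minimal sketch based on common mappings; the mapping table in the linked topic is authoritative):

static String myHandler(String s, int n, boolean b)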

Passing via an ARRAY

Declare the Java parameter as an array. For example, the third parameter in the following method is a String array:

static int myMethod(int fixedArgument1, int fixedArgument2, String[] stringArray)

Below is a complete example:

Create and load the table:

CREATE TABLE string_array_table(id INTEGER, a ARRAY);
INSERT INTO string_array_table (id, a) SELECT
        1, ARRAY_CONSTRUCT('Hello');
INSERT INTO string_array_table (id, a) SELECT
        2, ARRAY_CONSTRUCT('Hello', 'Jay');
INSERT INTO string_array_table (id, a) SELECT
        3, ARRAY_CONSTRUCT('Hello', 'Jay', 'Smith');

Create the UDF:

create or replace function concat_varchar_2(a ARRAY)
returns varchar
language java
handler='TestFunc_2.concatVarchar2'
target_path='@~/TestFunc_2.jar'
as
$$
    class TestFunc_2 {
        public static String concatVarchar2(String[] strings) {
            return String.join(" ", strings);
        }
    }
$$;

Call the UDF:

SELECT concat_varchar_2(a)
    FROM string_array_table
    ORDER BY id;
+---------------------+
| CONCAT_VARCHAR_2(A) |
|---------------------|
| Hello               |
| Hello Jay           |
| Hello Jay Smith     |
+---------------------+

Passing via varargs

Using varargs is very similar to using an array.

In your Java code, use Java’s varargs declaration style:

static int myMethod(int fixedArgument1, int fixedArgument2, String ... stringArray)

Below is a complete example. The only significant difference between this example and the preceding example (for arrays) is the declaration of the parameters to the method.

Create and load the table:

CREATE TABLE string_array_table(id INTEGER, a ARRAY);
INSERT INTO string_array_table (id, a) SELECT
        1, ARRAY_CONSTRUCT('Hello');
INSERT INTO string_array_table (id, a) SELECT
        2, ARRAY_CONSTRUCT('Hello', 'Jay');
INSERT INTO string_array_table (id, a) SELECT
        3, ARRAY_CONSTRUCT('Hello', 'Jay', 'Smith');

Create the UDF:

create or replace function concat_varchar(a ARRAY)
returns varchar
language java
handler='TestFunc.concatVarchar'
target_path='@~/TestFunc.jar'
as
$$
    class TestFunc {
        public static String concatVarchar(String ... stringArray) {
            return String.join(" ", stringArray);
        }
    }
$$;

Call the UDF:

SELECT concat_varchar(a)
    FROM string_array_table
    ORDER BY id;
+-------------------+
| CONCAT_VARCHAR(A) |
|-------------------|
| Hello             |
| Hello Jay         |
| Hello Jay Smith   |
+-------------------+

Returning NULL explicitly from an in-line UDF

The following code shows how to return a NULL value explicitly. The Java value null is converted to SQL NULL.

Create the UDF:

create or replace function return_a_null()
returns varchar null
language java
handler='TemporaryTestLibrary.returnNull'
target_path='@~/TemporaryTestLibrary.jar'
as
$$
class TemporaryTestLibrary {
  public static String returnNull() {
    return null;
  }
}
$$;

Call the UDF:

SELECT return_a_null();
+-----------------+
| RETURN_A_NULL() |
|-----------------|
| NULL            |
+-----------------+

Passing an OBJECT to an in-line Java UDF

The following example uses the SQL OBJECT data type and the corresponding Java data type (Map<String, String>), and extracts a value from the OBJECT. This example also shows that you can pass multiple parameters to a Java UDF.

Create and load a table that contains a column of type OBJECT:

CREATE TABLE objectives (o OBJECT);
INSERT INTO objectives SELECT PARSE_JSON('{"outer_key" : {"inner_key" : "inner_value"} }');

Create the UDF:

create or replace function extract_from_object(x OBJECT, key VARCHAR)
returns variant
language java
handler='VariantLibrary.extract'
target_path='@~/VariantLibrary.jar'
as
$$
import java.util.Map;
class VariantLibrary {
  public static String extract(Map<String, String> m, String key) {
    return m.get(key);
  }
}
$$;

Call the UDF:

SELECT extract_from_object(o, 'outer_key'), 
       extract_from_object(o, 'outer_key')['inner_key'] FROM objectives;
+-------------------------------------+--------------------------------------------------+
| EXTRACT_FROM_OBJECT(O, 'OUTER_KEY') | EXTRACT_FROM_OBJECT(O, 'OUTER_KEY')['INNER_KEY'] |
|-------------------------------------+--------------------------------------------------|
| {                                   | "inner_value"                                    |
|   "inner_key": "inner_value"        |                                                  |
| }                                   |                                                  |
+-------------------------------------+--------------------------------------------------+

Passing a GEOGRAPHY value to an in-line Java UDF

The following example uses the SQL GEOGRAPHY data type.

Create the UDF:

create or replace function geography_equals(x geography, y geography)
returns boolean
language java
packages=('com.snowflake:snowpark:1.2.0')
handler='TestGeography.compute'
as
$$
import com.snowflake.snowpark_java.types.Geography;

class TestGeography {
  public static boolean compute(Geography geo1, Geography geo2) {
    return geo1.equals(geo2);
  }
}
$$;

You can use the PACKAGES clause to specify a Snowflake system package such as the Snowpark package. When you do, you don’t need to also include the Snowpark JAR file as a value of an IMPORTS clause. For more on PACKAGES, see CREATE FUNCTION optional parameters.

Create data and call the UDF with that data:

create table geocache_table (id INTEGER, g1 GEOGRAPHY, g2 GEOGRAPHY);

insert into geocache_table (id, g1, g2) select
    1, TO_GEOGRAPHY('POINT(-122.35 37.55)'), TO_GEOGRAPHY('POINT(-122.35 37.55)');
insert into geocache_table (id, g1, g2) select
    2, TO_GEOGRAPHY('POINT(-122.35 37.55)'), TO_GEOGRAPHY('POINT(90.0 45.0)');

select id, g1, g2, geography_equals(g1, g2) as "EQUAL?"
    from geocache_table
    order by id;

The output looks similar to:

+----+--------------------------------------------------------+---------------------------------------------------------+--------+
| ID | G1                                                     | G2                                                      | EQUAL? |
+----+--------------------------------------------------------+---------------------------------------------------------+--------+
| 1  | { "coordinates": [ -122.35, 37.55 ], "type": "Point" } | { "coordinates": [ -122.35,  37.55 ], "type": "Point" } | TRUE   |
| 2  | { "coordinates": [ -122.35, 37.55 ], "type": "Point" } | { "coordinates": [   90.0,   45.0  ], "type": "Point" } | FALSE  |
+----+--------------------------------------------------------+---------------------------------------------------------+--------+

Passing a VARIANT value to an in-line Java UDF

When you pass a value of the SQL VARIANT type to a Java UDF, Snowflake can convert the value to the Variant type provided with the Snowpark package. Note that Variant is supported in Snowpark package versions 1.4.0 and later.

The Snowpark Variant type provides methods for converting values between Variant and other types.
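
For example, a handler that receives a Variant could convert it like this (a minimal sketch; it assumes the incoming value is a JSON object with a numeric price field):

import java.util.Map;
import com.snowflake.snowpark_java.types.Variant;

class VariantSketch {
  public static String describe(Variant v) {
    Map<String, Variant> m = v.asMap();  // Variant -> Map of nested Variants
    int price = m.get("price").asInt();  // nested Variant -> int
    return "price=" + price;             // build an ordinary Java String
  }
}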

To use the Snowpark Variant type, use the PACKAGES clause to specify the Snowpark package when creating the UDF. When you do, you don’t need to also include the Snowpark JAR file as a value of an IMPORTS clause. For more information on PACKAGES, see CREATE FUNCTION optional parameters.

Code in the following example receives JSON data stored as the VARIANT type, then uses the Variant type in the Snowpark library to retrieve the price value from the JSON. The received JSON has a structure similar to the JSON displayed in Sample Data Used in Examples.

create or replace function retrieve_price(v variant)
returns integer
language java
packages=('com.snowflake:snowpark:1.4.0')
handler='VariantTest.retrievePrice'
as
$$
import java.util.Map;
import com.snowflake.snowpark_java.types.Variant;

public class VariantTest {
  public static Integer retrievePrice(Variant v) throws Exception {
    // Convert the VARIANT argument to a Map of nested Variant values,
    // then read the "price" field of the nested "vehicle" value as an int.
    Map<String, Variant> saleMap = v.asMap();
    int price = saleMap.get("vehicle").asMap().get("price").asInt();
    return price;
  }
}
$$;

Reading a file with a Java UDF

You can read the contents of a file with handler code. For example, you might want to read a file to process unstructured data with the handler. For more information on processing unstructured data, along with example code, refer to Processing unstructured data with UDF and procedure handlers.

The file must be on a Snowflake stage that’s available to your handler.

To read the contents of staged files, your handler can:

  • Read a file whose file path is statically-specified in the IMPORTS clause. At run time, your code reads the file from the UDF’s home directory.

    This can be useful when you want to access the file during initialization; a sketch of this pattern appears in Reading a file specified statically in IMPORTS below.

  • Read a dynamically-specified file by calling methods of either the SnowflakeFile class or the InputStream class.

    You might do this if you need to access a file specified by the caller. For more information, see Reading a dynamically-specified file with SnowflakeFile and Reading a dynamically-specified file with InputStream in this topic.

    SnowflakeFile provides features not available with InputStream, as described below.

    Class: SnowflakeFile

    Input: URL formats:

    • Scoped URL to reduce the risk of file injection attacks when the function’s caller is not also its owner.

    • File URL or string path for files that the UDF owner has access to.

    The file must be located in a named internal stage or an external stage.

    Notes: Easily access additional file attributes, such as file size.

    Class: InputStream

    Input: URL formats:

    • Scoped URL to reduce the risk of file injection attacks when the function’s caller is not also its owner.

    The file must be located in an internal or external stage.
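
For example, a handler could use one of those additional file attributes, such as the file’s size (a minimal sketch; the getSize method is assumed from the notes above):

import com.snowflake.snowpark_java.types.SnowflakeFile;

class FileInfo {
  // Return the size in bytes of a staged file.
  // getSize is assumed from the "file attributes" note above.
  public static long fileSize(String filePath) {
    SnowflakeFile file = SnowflakeFile.newInstance(filePath);
    return file.getSize();
  }
}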

Prerequisites

Before your Java handler code can read a file on a stage, you must do the following to make the file available to the code:

  1. Create a stage that’s available to your handler.

    You can use an external stage or internal stage. If you use an internal stage, it must be a user or named stage; Snowflake does not currently support using a table stage for UDF dependencies. For more on creating a stage, see CREATE STAGE. For more on choosing an internal stage type, see Choosing an internal stage for local files.

    Keep in mind that adequate privileges on the stage must be assigned to roles performing SQL actions that read from the stage. For more information, see Granting privileges for user-defined functions.

  2. Copy the file that your code will read to the stage.

    You can copy the file from a local drive to a stage by using the PUT command. For command reference, see PUT. For information on staging files with PUT, see Staging data files from a local file system.

Reading a file specified statically in IMPORTS

Your handler can read a file whose stage path has been specified in the IMPORTS clause of the CREATE FUNCTION command.

When you specify a file in the IMPORTS clause, Snowflake copies that file from the stage to the UDF’s home directory (also called the import directory), which is the directory from which the UDF actually reads the file.

Because imported files are copied to a single directory and must have unique names within that directory, each file in the IMPORTS clause must have a distinct name, even if the files start out in different stages or different subdirectories within a stage.
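
Because of this requirement, an IMPORTS clause like the following (hypothetical stage paths) would not work, since both entries end in the same file name:

... imports = ('@stage_one/config.txt', '@stage_two/config.txt') ...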

The following example creates and calls a Java UDF that reads a file.

The Java source code below defines a method named readFile, which the UDF uses as its handler.

package my_package;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

class TestReadRelativeFile {
    public static String readFile(String fileName) throws IOException {
        StringBuilder contentBuilder = new StringBuilder();

        // Snowflake copies files named in the IMPORTS clause to the import
        // directory and exposes that directory's path in this JVM property.
        String importDirectory = System.getProperty("com.snowflake.import_directory");
        String fPath = importDirectory + fileName;

        // Read the file line by line, rebuilding its contents as a single string.
        Stream<String> stream = Files.lines(Paths.get(fPath), StandardCharsets.UTF_8);
        stream.forEach(s -> contentBuilder.append(s).append("\n"));
        return contentBuilder.toString();
    }
}

The following SQL code creates the UDF. This code assumes that the Java source code has been compiled and put into a JAR file named TestReadRelativeFile.jar, which the UDF imports. The second and third imported files, my_config_file_1.txt and my_config_file_2.txt, are configuration files that the UDF can read.

CREATE FUNCTION file_reader(file_name VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVA
IMPORTS = ('@my_stage/my_package/TestReadRelativeFile.jar',
           '@my_stage/my_path/my_config_file_1.txt',
           '@my_stage/my_path/my_config_file_2.txt')
HANDLER = 'my_package.TestReadRelativeFile.readFile';

This code calls the UDF:

SELECT file_reader('my_config_file_1.txt') ...;
...
SELECT file_reader('my_config_file_2.txt') ...;
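
If the handler needs the file contents during initialization rather than on each call, it could instead read the file once in a static initializer (a sketch reusing the import-directory property shown above; the file name comes from the IMPORTS clause in this example):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class ConfigReader {
    // Loaded once, when the handler class is first initialized.
    private static final String CONFIG;

    static {
        String importDirectory = System.getProperty("com.snowflake.import_directory");
        try {
            CONFIG = new String(
                Files.readAllBytes(Paths.get(importDirectory + "my_config_file_1.txt")),
                StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static String getConfig() {
        return CONFIG;
    }
}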

Choosing whether to access a file in compressed or uncompressed format

Files in a stage can be stored in compressed or uncompressed format. Users can compress the file before copying it to the stage, or can tell the PUT command to compress the file.

When Snowflake copies a file compressed in GZIP format from a stage to the UDF home directory, Snowflake can write the copy as-is, or Snowflake can decompress the content before writing the file.

If the file in the stage is compressed, and you also want the copy in the UDF home directory to be compressed, then simply use the original file name (e.g. “MyData.txt.gz”) in the IMPORTS clause. For example:

... imports = ('@MyStage/MyData.txt.gz', ...)

If the file in the stage is GZIP-compressed, but you would like the copy in the UDF home directory to be uncompressed, then when you specify the file name in the IMPORTS clause, omit the “.gz” extension. For example, if your stage contains “MyData.txt.gz”, but you want your UDF to read the file in uncompressed format, then specify “MyData.txt” in the IMPORTS clause. If there is not already an uncompressed file named “MyData.txt”, then Snowflake searches for “MyData.txt.gz” and automatically writes a decompressed copy to “MyData.txt” in the UDF home directory. Your UDF can then open and read the uncompressed file “MyData.txt”.
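
In that case, the IMPORTS clause names the uncompressed copy:

... imports = ('@MyStage/MyData.txt', ...)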

Note that smart decompression applies only to the copy in the UDF home directory; the original file in the stage is not changed.

Follow these best practices for handling compressed files:

  • Follow proper file naming conventions. If a file is in GZIP-compressed format, then include the extension “.gz” at the end of the file name. If a file is not in GZIP-compressed format, then do not end the file name with the “.gz” extension.

  • Avoid creating files whose names differ only by the extension “.gz”. For example, do not create both “MyData.txt” and “MyData.txt.gz” in the same stage and directory, and do not try to import both “MyData.txt” and “MyData.txt.gz” in the same CREATE FUNCTION command.

  • Do not compress files twice. For example, if you compress a file manually, and then you PUT that file without using AUTO_COMPRESS=FALSE, the file will be compressed a second time (see the sketch after this list). Smart decompression will decompress it only once, so the data (or JAR) file will still be compressed when it is stored in the UDF home directory.

  • In the future, Snowflake might extend smart decompression to compression algorithms other than GZIP. To prevent compatibility issues in the future, apply these best practices to files that use any type of compression.
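
The following PUT command (hypothetical local path and stage name) uploads an already-compressed file without compressing it again:

put file:///tmp/MyData.txt.gz @my_stage auto_compress = false;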

Note

JAR files can also be stored in compressed or uncompressed format in a stage. Snowflake automatically decompresses all compressed JAR files before making them available to the Java UDF.

Reading a dynamically-specified file with SnowflakeFile

Using methods of the SnowflakeFile class, you can read files from a stage with your Java handler code. The SnowflakeFile class is included on the classpath available to Java UDF handlers on Snowflake.

Note

To make your code resilient to file injection attacks, always use a scoped URL when passing a file’s location to a UDF, particularly when the function’s caller is not also its owner. You can create a scoped URL in SQL using the built-in function BUILD_SCOPED_FILE_URL. For more information about what BUILD_SCOPED_FILE_URL does, see Introduction to unstructured data.

To develop your UDF code locally, add the Snowpark JAR containing SnowflakeFile to your code’s class path. For information about snowpark.jar, see Setting Up Your Development Environment for Snowpark Java. Note that Snowpark client applications cannot use this class.

When you use SnowflakeFile, you don’t need to specify either the staged file or the JAR containing SnowflakeFile in an IMPORTS clause of the CREATE FUNCTION statement.

Code in the following example uses SnowflakeFile to read a file from a specified stage location. Using an InputStream from the getInputStream method, it reads the file’s contents into a String variable.

CREATE OR REPLACE FUNCTION sum_total_sales(file string)
RETURNS INTEGER
LANGUAGE JAVA
HANDLER = 'SalesSum.sumTotalSales'
TARGET_PATH = '@jar_stage/sales_functions2.jar'
AS
$$
import java.io.InputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import com.snowflake.snowpark_java.types.SnowflakeFile;

public class SalesSum {

  public static int sumTotalSales(String filePath) throws IOException {
    int total = -1;

    // Use a SnowflakeFile instance to read sales data from a stage.
    SnowflakeFile file = SnowflakeFile.newInstance(filePath);
    InputStream stream = file.getInputStream();
    String contents = new String(stream.readAllBytes(), StandardCharsets.UTF_8);

    // Omitted for brevity: code to retrieve sales data from JSON and assign it to the total variable.

    return total;
  }
}
$$;

Call the UDF, passing the location of the file in a scoped URL to reduce the likelihood of file injection attacks.

SELECT sum_total_sales(BUILD_SCOPED_FILE_URL('@sales_data_stage', '/car_sales.json'));

Note

The UDF owner must have access to any files whose locations are not scoped URLs. Your handler code can read such staged files by calling the SnowflakeFile.newInstance method with the boolean requireScopedUrl parameter set to false.

The following example uses SnowflakeFile.newInstance while specifying that a scoped URL is not required.

String filename = "@my_stage/filename.txt";
SnowflakeFile sfFile = SnowflakeFile.newInstance(filename, false);

Reading a dynamically-specified file with InputStream

You can read file contents directly into a java.io.InputStream by making your handler function’s argument an InputStream variable. This can be useful when the function’s caller will want to pass a file path as an argument.

Note

To make your code resilient to file injection attacks, always use a scoped URL when passing a file’s location to a UDF, particularly when the function’s caller is not also its owner. You can create a scoped URL in SQL using the built-in function BUILD_SCOPED_FILE_URL. For more information about what BUILD_SCOPED_FILE_URL does, see Introduction to unstructured data.

Code in the following example has a handler function sumTotalSales that takes an InputStream and returns an int. At run time, Snowflake automatically assigns the contents of the file at the file variable’s path to the stream argument variable.

CREATE OR REPLACE FUNCTION sum_total_sales(file string)
RETURNS INTEGER
LANGUAGE JAVA
HANDLER = 'SalesSum.sumTotalSales'
TARGET_PATH = '@jar_stage/sales_functions2.jar'
AS
$$
import java.io.InputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class SalesSum {

  public static int sumTotalSales(InputStream stream) throws IOException {
    int total = -1;
    String contents = new String(stream.readAllBytes(), StandardCharsets.UTF_8);

    // Omitted for brevity: code to retrieve sales data from JSON and assign it to the total variable.

    return total;
  }
}
$$;

Call the UDF, passing the location of the file in a scoped URL to reduce the likelihood of file injection attacks.

SELECT sum_total_sales(BUILD_SCOPED_FILE_URL('@sales_data_stage', '/car_sales.json'));

Creating and calling a simple staged Java UDF

The following statements create a simple Java UDF. This sample generally follows the file and directory structure described in Organizing your files.

Create and compile the Java handler code

  1. In the root directory of your project (here, my_udf), create a src subdirectory to hold the source .java files and a classes subdirectory to hold the generated .class files.

    You should have a directory hierarchy similar to the following:

    my_udf/
    |-- classes/
    |-- src/
    
  2. In the src directory, create a directory called mypackage to hold .java files whose classes are in the mypackage package.

  3. In the mypackage directory, create a MyUDFHandler.java file that contains your source code.

    package mypackage;
    
    public class MyUDFHandler {
    
      public static int decrementValue(int i)
      {
          return i - 1;
      }
    
      public static void main(String[] argv)
      {
          System.out.println("This main() function won't be called.");
      }
    }
    
  4. From your project root directory (here, my_udf), use the javac command to compile the source code.

    The javac command in the following example compiles MyUDFHandler.java to generate a MyUDFHandler.class file in the classes directory.

    javac -d classes src/mypackage/MyUDFHandler.java
    

    This example includes the following arguments:

    • -d classes – Directory into which generated class files should be written.

    • src/mypackage/MyUDFHandler.java – Path to the .java file in the form: source_directory/package_directory/Java_file_name.

Package the compiled code into a JAR file

  1. Optionally, in the project root directory create a manifest file named my_udf.manifest that contains the following attributes:

    Manifest-Version: 1.0
    Main-Class: mypackage.MyUDFHandler
    
  2. From your project root directory, run the jar command to create a JAR file containing the .class file and manifest.

    The jar command in the following example packages the generated MyUDFHandler.class file, under its mypackage package directory, into a .jar file called my_udf.jar. The -C ./classes flag specifies the location of the .class files.

    jar cmf my_udf.manifest my_udf.jar -C ./classes mypackage/MyUDFHandler.class
    

    This example includes the following arguments:

    • cmf – Command arguments: c to create a JAR file, m to use the specified .manifest file, and f to give the JAR file the specified name.

    • my_udf.manifest – Manifest file.

    • my_udf.jar – Name of the JAR file to create.

    • -C ./classes – Directory containing the generated .class files.

    • mypackage/MyUDFHandler.class – Package and name of .class file to include in the JAR.
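
    To verify the JAR’s contents before uploading it, you can list its entries with the standard JDK jar tool:

    jar tf my_udf.jar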

Upload the JAR file with the compiled handler to a stage

  1. In Snowflake, create a stage called jar_stage to store the JAR file containing your UDF handler.

    For more information on creating a stage, see CREATE STAGE.

  2. Use the PUT command to copy the JAR file from the local file system to a stage.

put
    file:///Users/Me/my_udf/my_udf.jar
    @jar_stage
    auto_compress = false
    overwrite = true
    ;

You can store the PUT command in a script file and then execute that file through SnowSQL.

The snowsql command looks similar to the following:

snowsql -a <account_identifier> -w <warehouse> -d <database> -s <schema> -u <user> -f put_command.sql

This example assumes that the user’s password is specified in the SNOWSQL_PWD environment variable.
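
For example, in a POSIX shell (hypothetical value):

export SNOWSQL_PWD='my_password'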

Create the UDF with the compiled code as handler

Create the UDF:

create function decrement_value(i numeric(9, 0))
  returns numeric
  language java
  imports = ('@jar_stage/my_udf.jar')
  handler = 'mypackage.MyUDFHandler.decrementValue'
  ;

Call the UDF:

select decrement_value(-15);
+----------------------+
| DECREMENT_VALUE(-15) |
|----------------------|
|                  -16 |
+----------------------+