Java UDF handler examples¶
This topic includes simple examples of UDF handler code written in Java.
For more on using Java to create a UDF handler, see Creating a Java UDF handler.
Creating and calling a simple in-line Java UDF¶
The following statements create and call an in-line Java UDF. This code returns the VARCHAR passed to it.
This function is declared with the optional CALLED ON NULL INPUT
clause to indicate that the function is
called even if the value of the input is NULL. (This function would return NULL with or without this clause, but
you could modify the code to handle NULL another way, for example, to return an empty string.)
Create the UDF:
create or replace function echo_varchar(x varchar)
returns varchar
language java
called on null input
handler='TestFunc.echoVarchar'
target_path='@~/testfunc.jar'
as
'class TestFunc {
  public static String echoVarchar(String x) {
    return x;
  }
}';
Call the UDF:
SELECT echo_varchar('Hello');
+-----------------------+
| ECHO_VARCHAR('HELLO') |
|-----------------------|
| Hello |
+-----------------------+
Passing a NULL to an in-line Java UDF¶
This uses the echo_varchar() UDF defined above. The SQL NULL value is implicitly converted to Java null, and that Java null is returned and implicitly converted back to SQL NULL.
Call the UDF:
SELECT echo_varchar(NULL);
+--------------------+
| ECHO_VARCHAR(NULL) |
|--------------------|
| NULL |
+--------------------+
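As noted above, you could modify the handler to treat NULL differently, for example, by returning an empty string. The following variant is a minimal sketch, not part of the original example; the function name echo_varchar_not_null and the target path are placeholders:
create or replace function echo_varchar_not_null(x varchar)
returns varchar
language java
called on null input
handler='TestFunc.echoVarchar'
target_path='@~/testfunc_not_null.jar'
as
'class TestFunc {
  public static String echoVarchar(String x) {
    // Assumption for this sketch: callers prefer an empty string over NULL.
    return (x == null) ? "" : x;
  }
}';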
Passing array values¶
Java methods can receive SQL arrays in either of two ways:
Using Java’s array feature.
Using Java’s varargs (variable number of arguments) feature.
In both cases, your SQL code must pass an ARRAY.
Note
Be sure to use Java types with valid mappings to SQL types. For more information, refer to SQL-Java Data Type Mappings.
Passing via an ARRAY¶
Declare the Java parameter as an array. For example, the third parameter in the following method is a String array:
static int myMethod(int fixedArgument1, int fixedArgument2, String[] stringArray)
Below is a complete example:
Create and load the table:
CREATE TABLE string_array_table(id INTEGER, a ARRAY);
INSERT INTO string_array_table (id, a) SELECT 1, ARRAY_CONSTRUCT('Hello');
INSERT INTO string_array_table (id, a) SELECT 2, ARRAY_CONSTRUCT('Hello', 'Jay');
INSERT INTO string_array_table (id, a) SELECT 3, ARRAY_CONSTRUCT('Hello', 'Jay', 'Smith');
Create the UDF:
create or replace function concat_varchar_2(a ARRAY)
returns varchar
language java
handler='TestFunc_2.concatVarchar2'
target_path='@~/TestFunc_2.jar'
as
$$
class TestFunc_2 {
  public static String concatVarchar2(String[] strings) {
    return String.join(" ", strings);
  }
}
$$;
Call the UDF:
SELECT concat_varchar_2(a)
  FROM string_array_table
  ORDER BY id;
+---------------------+
| CONCAT_VARCHAR_2(A) |
|---------------------|
| Hello               |
| Hello Jay           |
| Hello Jay Smith     |
+---------------------+
Passing via varargs¶
Using varargs is very similar to using an array.
In your Java code, use Java’s varargs declaration style:
static int myMethod(int fixedArgument1, int fixedArgument2, String ... stringArray)
Below is a complete example. The only significant difference between this example and the preceding example (for arrays) is the declaration of the parameters to the method.
Create and load the table:
CREATE TABLE string_array_table(id INTEGER, a ARRAY);
INSERT INTO string_array_table (id, a) SELECT 1, ARRAY_CONSTRUCT('Hello');
INSERT INTO string_array_table (id, a) SELECT 2, ARRAY_CONSTRUCT('Hello', 'Jay');
INSERT INTO string_array_table (id, a) SELECT 3, ARRAY_CONSTRUCT('Hello', 'Jay', 'Smith');
Create the UDF:
create or replace function concat_varchar(a ARRAY)
returns varchar
language java
handler='TestFunc.concatVarchar'
target_path='@~/TestFunc.jar'
as
$$
class TestFunc {
  public static String concatVarchar(String ... stringArray) {
    return String.join(" ", stringArray);
  }
}
$$;
Call the UDF:
SELECT concat_varchar(a)
  FROM string_array_table
  ORDER BY id;
+-------------------+
| CONCAT_VARCHAR(A) |
|-------------------|
| Hello             |
| Hello Jay         |
| Hello Jay Smith   |
+-------------------+
Returning NULL explicitly from an in-line UDF¶
The following code shows how to return a NULL value explicitly. The Java value null is converted to SQL NULL.
Create the UDF:
create or replace function return_a_null()
returns varchar
null
language java
handler='TemporaryTestLibrary.returnNull'
target_path='@~/TemporaryTestLibrary.jar'
as
$$
class TemporaryTestLibrary {
public static String returnNull() {
return null;
}
}
$$;
Call the UDF:
SELECT return_a_null();
+-----------------+
| RETURN_A_NULL() |
|-----------------|
| NULL |
+-----------------+
Passing an OBJECT to an in-line Java UDF¶
The following example uses the SQL OBJECT data type and the corresponding Java data type (Map<String, String>), and extracts a value from the OBJECT. This example also shows that you can pass multiple parameters to a Java UDF.
Create and load a table that contains a column of type OBJECT:
CREATE TABLE objectives (o OBJECT);
INSERT INTO objectives SELECT PARSE_JSON('{"outer_key" : {"inner_key" : "inner_value"} }');
Create the UDF:
create or replace function extract_from_object(x OBJECT, key VARCHAR)
returns variant
language java
handler='VariantLibrary.extract'
target_path='@~/VariantLibrary.jar'
as
$$
import java.util.Map;
class VariantLibrary {
public static String extract(Map<String, String> m, String key) {
return m.get(key);
}
}
$$;
Call the UDF:
SELECT extract_from_object(o, 'outer_key'),
extract_from_object(o, 'outer_key')['inner_key'] FROM objectives;
+-------------------------------------+--------------------------------------------------+
| EXTRACT_FROM_OBJECT(O, 'OUTER_KEY') | EXTRACT_FROM_OBJECT(O, 'OUTER_KEY')['INNER_KEY'] |
|-------------------------------------+--------------------------------------------------|
| { | "inner_value" |
| "inner_key": "inner_value" | |
| } | |
+-------------------------------------+--------------------------------------------------+
Passing a GEOGRAPHY value to an in-line Java UDF¶
The following example uses the SQL GEOGRAPHY data type.
Create the UDF:
create or replace function geography_equals(x geography, y geography)
returns boolean
language java
packages=('com.snowflake:snowpark:1.2.0')
handler='TestGeography.compute'
as
$$
import com.snowflake.snowpark_java.types.Geography;
class TestGeography {
public static boolean compute(Geography geo1, Geography geo2) {
return geo1.equals(geo2);
}
}
$$;
You can use the PACKAGES clause to specify a Snowflake system package such as the Snowpark package. When you do, you don’t need to also include the Snowpark JAR file as a value of an IMPORTS clause. For more on PACKAGES, see CREATE FUNCTION optional parameters.
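To check which versions of a system package such as Snowpark are available in your account, you can query the PACKAGES view in INFORMATION_SCHEMA. The query below is a sketch; results vary by account:
select * from information_schema.packages where (package_name = 'com.snowflake:snowpark' and language = 'java');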
Create data and call the UDF with that data:
create table geocache_table (id INTEGER, g1 GEOGRAPHY, g2 GEOGRAPHY);
insert into geocache_table (id, g1, g2) select
1, TO_GEOGRAPHY('POINT(-122.35 37.55)'), TO_GEOGRAPHY('POINT(-122.35 37.55)');
insert into geocache_table (id, g1, g2) select
2, TO_GEOGRAPHY('POINT(-122.35 37.55)'), TO_GEOGRAPHY('POINT(90.0 45.0)');
select id, g1, g2, geography_equals(g1, g2) as "EQUAL?"
from geocache_table
order by id;
The output looks similar to:
+----+--------------------------------------------------------+---------------------------------------------------------+--------+
| ID | G1 | G2 | EQUAL? |
|----+--------------------------------------------------------+---------------------------------------------------------+--------|
| 1 | { "coordinates": [ -122.35, 37.55 ], "type": "Point" } | { "coordinates": [ -122.35, 37.55 ], "type": "Point" } | TRUE |
| 2 | { "coordinates": [ -122.35, 37.55 ], "type": "Point" } | { "coordinates": [ 90.0, 45.0 ], "type": "Point" } | FALSE |
+----+--------------------------------------------------------+---------------------------------------------------------+--------+
Passing a VARIANT value to an in-line Java UDF¶
When you pass a value of the SQL VARIANT type to a Java UDF, Snowflake can convert the value to the Variant type provided with the Snowpark package. Note that Variant is supported from the Snowpark package version 1.4.0 and later.
The Snowpark Variant type provides methods for converting values between Variant and other types.
To use the Snowpark Variant type, use the PACKAGES clause to specify the Snowpark package when creating the UDF. When you do, you don’t need to also include the Snowpark JAR file as a value of an IMPORTS clause. For more information on PACKAGES, see CREATE FUNCTION optional parameters.
Code in the following example receives JSON data stored as the VARIANT type, then uses the Variant type in the Snowpark library to retrieve the price value from the JSON. The received JSON has a structure similar to the JSON displayed in Sample Data Used in Examples.
create or replace function retrieve_price(v variant)
returns integer
language java
packages=('com.snowflake:snowpark:1.4.0')
handler='VariantTest.retrievePrice'
as
$$
import java.util.Map;
import com.snowflake.snowpark_java.types.Variant;
public class VariantTest {
public static Integer retrievePrice(Variant v) throws Exception {
Map<String, Variant> saleMap = v.asMap();
int price = saleMap.get("vehicle").asMap().get("price").asInt();
return price;
}
}
$$;
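To try the UDF without the full sample data set, you could stage a minimal row whose JSON matches the shape the handler expects. The table name sales_data and the JSON below are illustrative assumptions, not part of the original example:
CREATE TABLE sales_data (v VARIANT);
INSERT INTO sales_data SELECT PARSE_JSON('{"vehicle" : {"make" : "Honda", "price" : 26000} }');

SELECT retrieve_price(v) FROM sales_data;
With this row, the query should return 26000.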
Reading a file with a Java UDF¶
You can read the contents of a file with handler code. For example, you might want to read a file to process unstructured data with the handler. For more information on processing unstructured data, along with example code, refer to Processing unstructured data with UDF and procedure handlers.
The file must be on a Snowflake stage that’s available to your handler.
To read the contents of staged files, your handler can:
Read a file whose file path is statically-specified in the IMPORTS clause. At run time, your code reads the file from the UDF’s home directory.
This can be useful when you want to access the file during initialization.
Read a dynamically-specified file by calling methods of either the SnowflakeFile class or the InputStream class. You might do this if you need to access a file specified by the caller. For more information, see the following in this topic:
Reading a dynamically-specified file with SnowflakeFile
Reading a dynamically-specified file with InputStream
SnowflakeFile provides features not available with InputStream, as described in the following comparison.
SnowflakeFile
Input URL formats:
Scoped URL to reduce the risk of file injection attacks when the function’s caller is not also its owner.
File URL or string path for files that the UDF owner has access to.
The file must be located in a named internal stage or an external stage.
Notes: Easily access additional file attributes, such as file size.
InputStream
Input URL formats:
Scoped URL to reduce the risk of file injection attacks when the function’s caller is not also its owner.
The file must be located in an internal or external stage.
Prerequisites¶
Before your Java handler code can read a file on a stage, you must do the following to make the file available to the code:
Create a stage that’s available to your handler.
You can use an external stage or internal stage. If you use an internal stage, it must be a user or named stage; Snowflake does not currently support using a table stage for UDF dependencies. For more on creating a stage, see CREATE STAGE. For more on choosing an internal stage type, see Choosing an internal stage for local files.
Keep in mind that adequate privileges on the stage must be assigned to roles performing SQL actions that read from the stage. For more information, see Granting privileges for user-defined functions.
Copy the file that your code will read to the stage.
You can copy the file from a local drive to a stage by using the PUT command. For command reference, see PUT. For information on staging files with PUT, see Staging data files from a local file system.
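For example, the following sketch uses placeholder stage, path, and file names; run the PUT command from a client such as SnowSQL, because PUT isn’t supported from the worksheet UI:
CREATE STAGE IF NOT EXISTS my_stage;
PUT file:///tmp/my_config_file_1.txt @my_stage/my_path AUTO_COMPRESS = FALSE;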
Reading a file specified statically in IMPORTS¶
Your handler can read a file whose stage path has been specified in the IMPORTS clause of the CREATE FUNCTION command.
When you specify a file in the IMPORTS clause, Snowflake copies that file from the stage to the UDF’s home directory (also called the import directory), which is the directory from which the UDF actually reads the file.
Because imported files are copied to a single directory and must have unique names within that directory, each file in the IMPORTS clause must have a distinct name, even if the files start out in different stages or different subdirectories within a stage.
The following example creates and calls a Java UDF that reads a file.
The Java source code below creates a Java method named readFile, which the UDF uses as its handler.
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;
class TestReadRelativeFile {
public static String readFile(String fileName) throws IOException {
StringBuilder contentBuilder = new StringBuilder();
String importDirectory = System.getProperty("com.snowflake.import_directory");
String fPath = importDirectory + fileName;
Stream<String> stream = Files.lines(Paths.get(fPath), StandardCharsets.UTF_8);
stream.forEach(s -> contentBuilder.append(s).append("\n"));
return contentBuilder.toString();
}
}
The following SQL code creates the UDF. This code assumes that the Java source code has been compiled and put into a JAR file named TestReadRelativeFile.jar, which the UDF imports. The second and third imported files, my_config_file_1.txt and my_config_file_2.txt, are configuration files that the UDF can read.
CREATE FUNCTION file_reader(file_name VARCHAR)
RETURNS VARCHAR
LANGUAGE JAVA
IMPORTS = ('@my_stage/my_package/TestReadRelativeFile.jar',
'@my_stage/my_path/my_config_file_1.txt',
'@my_stage/my_path/my_config_file_2.txt')
HANDLER = 'my_package.TestReadRelativeFile.readFile';
This code calls the UDF:
SELECT file_reader('my_config_file_1.txt') ...;
...
SELECT file_reader('my_config_file_2.txt') ...;
Choosing whether to access a file in compressed or uncompressed format¶
Files in a stage can be stored in compressed or uncompressed format. Users can compress the file before copying it to the stage, or can tell the PUT command to compress the file.
When Snowflake copies a file compressed in GZIP format from a stage to the UDF home directory, Snowflake can write the copy as-is, or Snowflake can decompress the content before writing the file.
If the file in the stage is compressed, and if you would like the copy in the UDF home directory to also be compressed, then simply use the original file name (e.g. “MyData.txt.gz”) in the IMPORTS clause. For example:
... imports = ('@MyStage/MyData.txt.gz', ...)
If the file in the stage is GZIP-compressed, but you would like the copy in the UDF home directory to be uncompressed, then when you specify the file name in the IMPORTS clause, omit the “.gz” extension. For example, if your stage contains “MyData.txt.gz”, but you want your UDF to read the file in uncompressed format, then specify “MyData.txt” in the IMPORTS clause. If there is not already an uncompressed file named “MyData.txt”, then Snowflake searches for “MyData.txt.gz” and automatically writes a decompressed copy to “MyData.txt” in the UDF home directory. Your UDF can then open and read the uncompressed file “MyData.txt”.
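For example, to read the staged “MyData.txt.gz” in uncompressed form, the IMPORTS clause names the file without the “.gz” extension, following the fragment style above:
... imports = ('@MyStage/MyData.txt', ...)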
Note that smart decompression applies only to the copy in the UDF home directory; the original file in the stage is not changed.
Follow these best practices for handling compressed files:
Follow proper file naming conventions. If a file is in GZIP-compressed format, then include the extension “.gz” at the end of the file name. If a file is not in GZIP-compressed format, then do not end the file name with the “.gz” extension.
Avoid creating files whose names differ only by the extension “.gz”. For example, do not create both “MyData.txt” and “MyData.txt.gz” in the same stage and directory, and do not try to import both “MyData.txt” and “MyData.txt.gz” in the same CREATE FUNCTION command.
Do not compress files twice. For example, if you compress a file manually, and then you PUT that file without using AUTO_COMPRESS=FALSE, the file will be compressed a second time. Smart decompression will decompress it only once, so the data (or JAR) file will still be compressed when it is stored in the UDF home directory. (See the example after this list.)
In the future, Snowflake might extend smart decompression to compression algorithms other than GZIP. To prevent compatibility issues in the future, apply these best practices to files that use any type of compression.
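For example, if a file was compressed manually before staging, a PUT command like the following (paths and stage name are placeholders) prevents the second compression described above:
put file:///tmp/MyData.txt.gz @MyStage auto_compress = false;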
Note
JAR files can also be stored in compressed or uncompressed format in a stage. Snowflake automatically decompresses all compressed JAR files before making them available to the Java UDF.
Reading a dynamically-specified file with SnowflakeFile¶
Using methods of the SnowflakeFile class, you can read files from a stage with your Java handler code. The SnowflakeFile class is included on the classpath available to Java UDF handlers on Snowflake.
Note
To make your code resilient to file injection attacks, always use a scoped URL when passing a file’s location to a UDF, particularly when the function’s caller is not also its owner. You can create a scoped URL in SQL using the built-in function BUILD_SCOPED_FILE_URL. For more information about what the BUILD_SCOPED_FILE_URL does, see Introduction to unstructured data.
To develop your UDF code locally, add the Snowpark JAR containing SnowflakeFile to your code’s class path. For information about snowpark.jar, see Setting Up Your Development Environment for Snowpark Java. Note that Snowpark client applications cannot use this class.
When you use SnowflakeFile, you don’t need to specify either the staged file or the JAR containing SnowflakeFile in an IMPORTS clause when you create the UDF with a CREATE FUNCTION statement.
Code in the following example uses SnowflakeFile to read a file from a specified stage location. Using an InputStream from the getInputStream method, it reads the file’s contents into a String variable.
CREATE OR REPLACE FUNCTION sum_total_sales(file string)
RETURNS INTEGER
LANGUAGE JAVA
HANDLER = 'SalesSum.sumTotalSales'
TARGET_PATH = '@jar_stage/sales_functions2.jar'
AS
$$
import java.io.InputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import com.snowflake.snowpark_java.types.SnowflakeFile;
public class SalesSum {
public static int sumTotalSales(String filePath) throws IOException {
int total = -1;
// Use a SnowflakeFile instance to read sales data from a stage.
SnowflakeFile file = SnowflakeFile.newInstance(filePath);
InputStream stream = file.getInputStream();
String contents = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
// Omitted for brevity: code to retrieve sales data from JSON and assign it to the total variable.
return total;
}
}
$$;
Call the UDF, passing the location of the file in a scoped URL to reduce the likelihood of file injection attacks.
SELECT sum_total_sales(BUILD_SCOPED_FILE_URL('@sales_data_stage', '/car_sales.json'));
Note
The UDF owner must have access to any files whose locations are not scoped URLs. You can read these staged files by having the handler code call the SnowflakeFile.newInstance method with a boolean value for a new requireScopedUrl parameter.
The following example uses SnowflakeFile.newInstance while specifying that a scoped URL is not required.
String filename = "@my_stage/filename.txt";
SnowflakeFile sfFile = SnowflakeFile.newInstance(filename, false);
Reading a dynamically-specified file with InputStream¶
You can read file contents directly into a java.io.InputStream by making your handler function’s argument an InputStream variable. This can be useful when the function’s caller wants to pass a file path as an argument.
Note
To make your code resilient to file injection attacks, always use a scoped URL when passing a file’s location to a UDF, particularly when the function’s caller is not also its owner. You can create a scoped URL in SQL using the built-in function BUILD_SCOPED_FILE_URL. For more information about what the BUILD_SCOPED_FILE_URL does, see Introduction to unstructured data.
Code in the following example has a handler function sumTotalSales that takes an InputStream and returns an int. At run time, Snowflake automatically assigns the contents of the file at the file variable’s path to the stream argument variable.
CREATE OR REPLACE FUNCTION sum_total_sales(file string)
RETURNS INTEGER
LANGUAGE JAVA
HANDLER = 'SalesSum.sumTotalSales'
TARGET_PATH = '@jar_stage/sales_functions2.jar'
AS
$$
import java.io.InputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
public class SalesSum {
public static int sumTotalSales(InputStream stream) throws IOException {
int total = -1;
String contents = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
// Omitted for brevity: code to retrieve sales data from JSON and assign it to the total variable.
return total;
}
}
$$;
Call the UDF, passing the location of the file in a scoped URL to reduce the likelihood of file injection attacks.
SELECT sum_total_sales(BUILD_SCOPED_FILE_URL('@sales_data_stage', '/car_sales.json'));
Creating and calling a simple staged Java UDF¶
The following statements create a simple Java UDF. This sample generally follows the file and directory structure described in Organizing your files.
Create and compile the Java handler code¶
In the root directory of your project (here, my_udf), create a src subdirectory to hold the source .java files and a classes subdirectory to hold the generated .class files.
You should have a directory hierarchy similar to the following:
my_udf/
|-- classes/
|-- src/
In the src directory, create a directory called mypackage to hold .java files whose classes are in the mypackage package.
In the mypackage directory, create a MyUDFHandler.java file that contains your source code:
package mypackage;

public class MyUDFHandler {

  public static int decrementValue(int i) {
    return i - 1;
  }

  public static void main(String[] argv) {
    System.out.println("This main() function won't be called.");
  }
}
From your project root directory (here, my_udf), use the javac command to compile the source code.
The javac command in the following example compiles MyUDFHandler.java to generate a MyUDFHandler.class file in the classes directory:
javac -d classes src/mypackage/MyUDFHandler.java
This example includes the following arguments:
-d classes – Directory into which generated class files should be written.
src/mypackage/MyUDFHandler.java – Path to the .java file in the form: source_directory/package_directory/Java_file_name.
Package the compiled code into a JAR file¶
Optionally, in the project root directory, create a manifest file named my_udf.manifest that contains the following attributes:
Manifest-Version: 1.0
Main-Class: mypackage.MyUDFHandler
From your project root directory, run the jar command to create a JAR file containing the .class file and manifest.
The jar command in the following example puts the generated MyUDFHandler.class file in a mypackage package folder into a .jar file called my_udf.jar. The -C ./classes flag specifies the location of the .class files.
jar cmf my_udf.manifest my_udf.jar -C ./classes mypackage/MyUDFHandler.class
This example includes the following arguments:
cmf – Command arguments: c to create a JAR file, m to use the specified .manifest file, and f to give the JAR file the specified name.
my_udf.manifest – Manifest file.
my_udf.jar – Name of the JAR file to create.
-C ./classes – Directory containing the generated .class files.
mypackage/MyUDFHandler.class – Package and name of the .class file to include in the JAR.
Upload the JAR file with the compiled handler to a stage¶
In Snowflake, create a stage called jar_stage to store the JAR file containing your UDF handler. For more information on creating a stage, see CREATE STAGE.
Use the PUT command to copy the JAR file from the local file system to the stage:
put
    file:///Users/Me/my_udf/my_udf.jar
    @jar_stage
    auto_compress = false
    overwrite = true
    ;
You can store the PUT command in a script file and then execute that file through SnowSQL. The snowsql command looks similar to the following:
snowsql -a <account_identifier> -w <warehouse> -d <database> -s <schema> -u <user> -f put_command.sql
This example assumes that the user’s password is specified in the SNOWSQL_PWD environment variable.
Create the UDF with the compiled code as handler¶
Create the UDF:
create function decrement_value(i numeric(9, 0))
returns numeric
language java
imports = ('@jar_stage/my_udf.jar')
handler = 'mypackage.MyUDFHandler.decrementValue'
;
Call the UDF:
select decrement_value(-15);
+----------------------+
| DECREMENT_VALUE(-15) |
|----------------------|
| -16 |
+----------------------+