Introduction to Java UDFs

This topic introduces Java UDFs and provides information to help you decide when to use a Java UDF.

What is a Java UDF?

A UDF (user-defined function) is a user-written function that can be called from Snowflake in the same way that a built-in function can be called.

Snowflake supports UDFs written in multiple languages, including Java.

For each row passed to a Java UDF, the UDF returns either a scalar (that is, single) value or, if defined as a table function, a set of rows.

UDFs accept zero or more parameters.

When a user calls a UDF, the user passes the name of the UDF and the parameters of the UDF to Snowflake. If the UDF is a Java UDF, Snowflake calls the appropriate Java code (called a handler method) in a JAR file. The handler method then returns the output to Snowflake, which passes it back to the client. Below is a simplified illustration of the data flow:

[Figure: UDF Data Flow]
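To make the idea of a handler method concrete, here is a minimal sketch of what such a method might look like. The class name GreetingHandler and the method name greet are hypothetical, chosen only for illustration, and the sketch assumes that a SQL NULL argument arrives as a Java null. The statement that registers the class and JAR file with Snowflake is not shown.

```java
// Hypothetical handler class; the names here are illustrative and do not
// come from the Snowflake documentation.
class GreetingHandler {
    // A scalar handler: one input value in, one output value out.
    // Snowflake would invoke this once for each row passed to the UDF.
    public static String greet(String name) {
        if (name == null) {
            // Assumption: a SQL NULL argument arrives as a Java null,
            // and returning null yields SQL NULL.
            return null;
        }
        return "Hello, " + name + "!";
    }
}
```

Called with the value Ada, this method returns the string Hello, Ada!, which Snowflake would then pass back to the client as the UDF's result for that row.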

Java UDFs can contain both new code and calls to existing libraries, which gives you both flexibility and code reuse. For example, if you already have data analysis code in Java, you can probably incorporate it into a Java UDF.

Snowflake currently supports writing Java UDFs in the following versions of Java:

  • 11.x

Deciding When to Use a Java UDF

This section describes advantages and limitations of Java UDFs.

For information about other potential languages in which to write UDFs, and for comparisons among those languages, see Overview of UDFs.

For information about other ways to extend Snowflake, see:

Advantages of Java UDFs

Java UDFs are particularly appropriate when one or more of the following are true:

  • You already have Java code (source or compiled) that you can use.

  • Your code uses (or could use) functions that already exist in standard Java libraries.

  • You know Java as well as or better than the other languages that support UDFs.
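As an illustration of reusing standard Java libraries, the following sketch shows a handler that delegates all of its work to java.time. The class name DateHandler and the method name daysBetween are hypothetical, and the sketch assumes that both arguments are non-null strings in ISO-8601 date format (for example, 2024-01-01).

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical handler class; the names are illustrative only.
class DateHandler {
    // Returns the number of whole days from startIso to endIso.
    // Assumption: both arguments are non-null ISO-8601 dates (yyyy-MM-dd);
    // LocalDate.parse throws DateTimeParseException otherwise.
    public static long daysBetween(String startIso, String endIso) {
        LocalDate start = LocalDate.parse(startIso);
        LocalDate end = LocalDate.parse(endIso);
        return ChronoUnit.DAYS.between(start, end);
    }
}
```

The handler itself contains no date arithmetic. Parsing, leap-year handling, and the day count all come from the standard library.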

Limitations on Java UDFs

General Limitations

  • Although your Java method can use classes and methods in the standard Java libraries, Snowflake security constraints disable some capabilities, such as writing to files. For details, see the section titled Following Good Security Practices.

  • Java UDFs are not sharable. Database objects that use Java UDFs are also not sharable. For example, you cannot:

    • Directly share a Java UDF.

    • Share a view that calls a Java UDF.

    • Share a function that calls a Java UDF.

    • Share a table with a masking or row access policy that calls a Java UDF.

  • If you try to create a Java UDF using the SECURE option (CREATE SECURE FUNCTION...), Snowflake returns an error. The SECURE option is not yet supported for Java UDFs.

  • Granting USAGE privilege on a Java UDF might allow the recipient to see the contents of files imported by that UDF. If you grant the USAGE privilege on a Java UDF to a role, and if that role executes a statement that calls that Java UDF, then any Java UDF in the same statement could read the contents of any files imported by the Java UDF on which you granted USAGE privilege.

  • Database replication does not yet include external or internal stages. When you promote a secondary database to serve as the primary database, you must recreate stage objects and re-import any files missing from internal stages. The files should have the same paths and filenames as in the original primary database.

  • The maximum size for a Java UDF output row is 16 MB.

Limitations on Cloning

A Java UDF can be cloned when the database or schema containing the Java UDF is cloned. To be cloned, the Java UDF must meet the following condition:

  • If the Java UDF references a stage (for example, the stage that contains the UDF’s JAR file), that stage must be outside the schema (or database) being cloned.

    You can keep a Java UDF and its referenced stage(s) in separate schemas (and/or separate databases) in the following ways:

    • Wherever the Java UDF references a stage, use a qualified stage name (e.g. “my_db.my_schema.my_stage()”) whose schema or database differs from that of the Java UDF. If the cloning operation clones a database, the stage reference should include the database and schema. If the cloning operation clones a schema, the stage reference should include the schema (and optionally the database).

    • Create the referenced stage by using a non-qualified stage name (which implicitly uses the current session’s active database and schema), and create the Java UDF by using a qualified name that does not match the session’s current database and schema.

    • Use the user’s stage as the referenced stage (the user’s stage is separate from any database’s stage or schema’s stage).

If one or more Java UDFs in the schema or database do not meet the required conditions, the schema or database can still be cloned, but the non-compliant Java UDFs are omitted from the clone without any error or warning message.
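The cloning condition above can be read as a simple scope test on the stage reference. The following sketch expresses that test in Java purely to illustrate the rule. It is not a Snowflake API, the class and method names are hypothetical, and it assumes simplified, unquoted three-part names of the form db.schema.stage compared without regard to case. It does not cover user stages.

```java
// Hypothetical helper that illustrates the cloning rule; not a Snowflake API.
class CloneScopeCheck {
    // Splits "db.schema.stage" into its three parts.
    // Assumption: identifiers are unquoted and contain no dots.
    private static String[] split(String qualifiedStage) {
        String[] parts = qualifiedStage.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected db.schema.stage, got: " + qualifiedStage);
        }
        return parts;
    }

    // True if the stage lies outside the schema being cloned, which is the
    // condition for a UDF that references the stage to be cloned.
    public static boolean outsideClonedSchema(
            String qualifiedStage, String clonedDb, String clonedSchema) {
        String[] parts = split(qualifiedStage);
        boolean sameDb = parts[0].equalsIgnoreCase(clonedDb);
        boolean sameSchema = parts[1].equalsIgnoreCase(clonedSchema);
        return !(sameDb && sameSchema);
    }

    // True if the stage lies outside the database being cloned.
    public static boolean outsideClonedDatabase(
            String qualifiedStage, String clonedDb) {
        return !split(qualifiedStage)[0].equalsIgnoreCase(clonedDb);
    }
}
```

Under these assumptions, a UDF in udf_db.udf_schema that references stage_db.stage_schema.my_stage passes both checks. A UDF that references udf_db.udf_schema.my_stage fails the schema check and would be silently omitted from a clone of that schema.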

Each cloned Java UDF has the same definition as the original. That definition includes any references to stages. The stage references in the Java UDF must be fully qualified, and are therefore absolute, not relative to the schema or database being cloned. Because both the original and the clone point to the same stage(s) and file(s):

  • Dropping the stage or removing required files from the stage disables both the original and the cloned UDF.

  • Altering the stage or the files on the stage (e.g. replacing the JAR file with a newer JAR file) affects both the original and the cloned UDF.
