Overview of UDFs

This topic covers concepts and usage details that apply to UDFs (user-defined functions).

In this Topic:

Scalar vs Tabular UDFs

A scalar UDF returns one output row for each input row. The returned row consists of a single column/value.

A tabular UDF returns zero, one, or multiple rows for each input row. Each returned row can contain one or more columns. The returned rows can be accessed in the FROM clause of a query. A tabular UDF is defined by specifying a return clause that contains the TABLE keyword and specifies the names and types of the columns in the table results.

Tabular UDFs are often called UDTFs (user-defined table functions) or table UDFs.

For more details about creating and using UDTFs, including examples, see Tabular SQL UDFs (UDTFs) and Tabular JavaScript UDFs (UDTFs). For more general information about the system-defined table functions provided in Snowflake, see Table Functions.

Supported Programming Languages for Creating UDFs

Snowflake supports the following programming languages:

SQL

A SQL UDF evaluates an arbitrary SQL expression and returns the result of the expression.

The function definition can be a SQL expression that returns either a scalar (i.e. single) value or, if defined as a table function, a set of rows.

The expression can be a query expression, though for non-tabular functions, the query expression must be guaranteed to return at most one row, containing a single column.

The expression defining a UDF can refer to the input arguments of the function, and to database objects such as tables, views, and sequences. The UDF owner must have appropriate privileges on any database objects that the UDF accesses. A SQL UDF’s defining expression can refer to other user-defined functions, though it cannot refer recursively to itself, either directly or through another function calling back to it.

JavaScript

JavaScript UDFs allow you to manipulate data using the JavaScript programming language and runtime environment. JavaScript UDFs are created in the same way as SQL UDFs, but with the LANGUAGE parameter set to JAVASCRIPT.

Similar to SQL UDFs, JavaScript UDFs can return either a scalar or a tabular result, depending on how the UDF is defined.

A JavaScript UDF’s defining expression can refer recursively to itself, but cannot refer to other user-defined functions.

Note

JavaScript UDFs have some requirements, usage details, and limitations that don’t apply to SQL UDFs. For more details, including examples, see JavaScript UDFs.

Java

Java UDFs allow you to manipulate data using the Java programming language and runtime environment.

Java UDFs return a scalar result. Java UDTFs are not currently supported.

You can create Java UDFs in either of two ways:

  • In-line: You type in the Java code as part of the CREATE FUNCTION command.

  • JAR file: You specify the location of an already-compiled JAR file as part of the CREATE FUNCTION command.

Each of these is documented in more detail later.

Note

Java UDFs have some requirements, usage details, and limitations that don’t apply to SQL UDFs. For more details, see:

Choosing Your Programming Language

Many factors might affect which programming language you use to write a UDF. Factors might include:

  • Whether you already have code in a particular language. For example, if you already have a .jar file that contains a method that does the work you need, then you might prefer Java as your programming language.

  • The capabilities of the language.

  • Whether a language has libraries that can help you do the processing that you need to do.

The following table summarizes key capabilities of each UDF programming language. The list below describes the columns of the table:

  • Tabular: Indicates whether Snowflake supports writing table functions (in addition to scalar functions).

  • Looping and Branching: Indicates whether the programming language supports looping and branching.

  • Pre-compiled: Indicates whether you can supply the code for your UDF as a compiled file (e.g. as a .JAR file).

  • In-line: Indicates whether you can provide the code as text in the CREATE FUNCTION statement.

  • Sharable: Indicates whether UDFs written in this language can be used with the Snowflake Secure Data Sharing feature.

Language

Tabular

Looping and Branching

Pre-Compiled

In-line

Sharable

SQL

Yes

No

No

Yes

Yes

JavaScript

Yes

Yes

No

Yes

Java

No

Yes

Yes

Yes

No 1

1

For more information about limits on sharing Java UDFs, see Limitations on Java UDFs.

Naming Conventions for UDFs

UDFs are database objects, meaning that they are created in a specified database and schema. As such, they have a fully-qualified name defined by their namespace, in the form of db.schema.function_name, for example:

SELECT temporary_db_qualified_names_test.temporary_schema_1.udf_pi();

When called without their fully-qualified name, UDFs are resolved according to the database and schema in use for the session.

This is in contrast to the built-in, system-defined functions provided by Snowflake, which have no namespace and, therefore, can be called from anywhere.

Conflicts with System-defined Functions

To avoid conflicts when calling functions, Snowflake does not allow creating UDFs with the same name as any of the system-defined functions.

Overloading of UDF Names

Snowflake supports overloading of SQL UDF names. Multiple SQL UDFs in the same schema can have the same name, as long as their argument signatures differ, either by the number of arguments or the argument types. When an overloaded UDF is called, Snowflake checks the arguments and calls the correct function.

Consider the following examples, which create two SQL UDFs named add5:

CREATE OR REPLACE FUNCTION add5 (n number)
  RETURNS number
  AS 'n + 5';

CREATE OR REPLACE FUNCTION add5 (s string)
  RETURNS string
  AS 's || ''5''';

Important

In the second ADD5 function, single quotes are used to escape the string literal '5'. Any single quotes used in a UDF definition must be escaped by single quotes.

If add5 is called with a numeric argument, then the first implementation is chosen, while a string-typed argument uses the second implementation. If the argument is neither a number nor a string, then the implementation depends on Snowflake’s implicit type conversion rules. For instance, a date-typed argument would be converted to a string, and the string implementation would be selected, because conversion from DATE to NUMBER is not supported.

For example:

select add5(1);

+---------+
| ADD5(1) |
|---------|
|       6 |
+---------+

select add5('1');

+-----------+
| ADD5('1') |
|-----------|
| 15        |
+-----------+

select add5('hello');

+---------------+
| ADD5('HELLO') |
|---------------|
| hello5        |
+---------------+

select add5(to_date('2014-01-01'));

+-----------------------------+
| ADD5(TO_DATE('2014-01-01')) |
|-----------------------------|
| 2014-01-015                 |
+-----------------------------+

Be careful when using overloading. The combination of automatic type conversion and overloading makes it easy for a minor user error to cause unexpected results. For an example, see Overloading Function Names.

Caution

For UDFs written in languages other than SQL, the rules might differ.

Examples

Overloading Function Names

As described in Overloading of UDF Names (in this topic), you can overload function names.

The next example of overloading shows how combining overloading and automatic type conversion can make it easy to get unexpected results:

Create a function that takes a FLOAT parameter:

CREATE FUNCTION add_pi(PARAM_1 FLOAT)
    RETURNS FLOAT
    LANGUAGE SQL
    AS $$
        PARAM_1 + 3.1415926::FLOAT
    $$;

Call the function twice. The first time, pass a FLOAT. The second time, pass a VARCHAR. The VARCHAR is converted to a FLOAT, and the output from each call is identical:

SELECT add_pi(1.0), add_pi('1.0');
+-------------+---------------+
| ADD_PI(1.0) | ADD_PI('1.0') |
|-------------+---------------|
|   4.1415926 |     4.1415926 |
+-------------+---------------+

Now, create an overloaded function that takes a VARCHAR parameter:

CREATE FUNCTION add_pi(PARAM_1 VARCHAR)
    RETURNS VARCHAR
    LANGUAGE SQL
    AS $$
        PARAM_1 || ' + 3.1415926'
    $$;

Now use exact the same CALLs as before. Note the difference in output between these two CALLs and the previous two CALLs.

SELECT add_pi(1.0), add_pi('1.0');
+-------------+-----------------+
| ADD_PI(1.0) | ADD_PI('1.0')   |
|-------------+-----------------|
|   4.1415926 | 1.0 + 3.1415926 |
+-------------+-----------------+