Overview of UDFs¶

This topic covers concepts and usage details that apply to UDFs (user-defined functions).

In this Topic:

Scalar vs Tabular UDFs¶

UDFs may be scalar or tabular.

A scalar function returns one output row for each input row. The returned row consists of a single column/value.

For more details about creating and using scalar UDFs, see:

A tabular function, also called a table function, returns zero, one, or multiple rows for each input row. A tabular UDF is defined by specifying a return clause that contains the TABLE keyword and specifies the names and data types of the columns in the table results. Tabular UDFs are often called UDTFs (user-defined table functions) or table UDFs.

For more details about creating and using UDTFs, including examples, see:

Supported Programming Languages for Creating UDFs¶

Snowflake supports the following programming languages:

SQL¶

A SQL UDF can be either scalar or tabular.

A SQL UDF evaluates an arbitrary SQL expression and returns the result of the expression.

The expression can be either a general expression or a query expression:

• This UDF uses the general expression pi() * radius * radius:

CREATE FUNCTION area_of_circle(radius FLOAT)
RETURNS FLOAT
AS
$$pi() * radius * radius$$
;

• This UDF uses a query expression:

CREATE FUNCTION profit()
RETURNS NUMERIC(11, 2)
AS
$$SELECT SUM((retail_price - wholesale_price) * number_sold) FROM purchases$$
;


For non-tabular UDFs, the query expression must be guaranteed to return at most one row, containing one column.

The expression defining a UDF can refer to the input arguments of the function, as shown in the examples above.

The expression defining a UDF can refer to database objects such as tables, views, and sequences, as shown in the examples above. However, the following restrictions apply:

• The UDF owner must have appropriate privileges on any database objects that the UDF accesses.

• Database objects cannot be referred to using dynamic SQL. For example, the following statement fails because IDENTIFIER(table_name_parameter) is not allowed:

CREATE OR REPLACE FUNCTION profit2(table_name_parameter VARCHAR)
RETURNS NUMERIC(11, 2)
AS
$$SELECT SUM((retail_price - wholesale_price) * number_sold) FROM IDENTIFIER(table_name_parameter)$$
;


The expression defining a UDF should not contain a semicolon as a statement terminator. For example, note that the SELECT in the query expression UDF above is not terminated with a semicolon.

A SQL UDF’s defining expression can refer to other user-defined functions. However, it cannot refer recursively to itself, either directly or indirectly (i.e. through another function calling back to it).

JavaScript¶

JavaScript UDFs allow you to manipulate data using the JavaScript programming language and runtime environment. JavaScript UDFs are created in the same way as SQL UDFs, but with the LANGUAGE parameter set to JAVASCRIPT.

Similar to SQL UDFs, JavaScript UDFs can return either a scalar or a tabular result, depending on how the UDF is defined.

A JavaScript UDF’s defining expression can refer recursively to itself, but cannot refer to other user-defined functions.

Note

JavaScript UDFs have some requirements, usage details, and limitations that don’t apply to SQL UDFs. For more details, including examples, see JavaScript UDFs.

Java¶

Java UDFs allow you to manipulate data using the Java programming language and runtime environment.

Similar to SQL and JavaScript UDFs, Java UDFs can return either a scalar or a tabular result, depending on how the UDF is defined.

You can create Java UDFs in either of two ways:

• In-line: You type in the Java code as part of the CREATE FUNCTION command.

• JAR file: You specify the location of an already-compiled JAR file as part of the CREATE FUNCTION command.

Note

Java UDFs have some requirements, usage details, and limitations that don’t apply to SQL UDFs. For more details, see:

Python¶

Python UDFs allow you to manipulate data using the Python programming language and runtime environment.

Similar to Java UDFs, Python UDFs can return either a scalar or a tabular result, depending on how the UDF is defined.

You can create Python UDFs in either of two ways:

• In-line: You type in the Python code as part of the CREATE FUNCTION command.

• As a module copied to a stage: You specify the location of the module file as part of the CREATE FUNCTION command.

Note

Python UDFs have some requirements, usage details, and limitations that don’t apply to SQL UDFs. For more details, see:

Many factors might affect which programming language you use to write a UDF. Factors might include:

• Whether you already have code in a particular language. For example, if you already have a .jar file that contains a method that does the work you need, then you might prefer Java as your programming language.

• The capabilities of the language.

• Whether a language has libraries that can help you do the processing that you need to do.

The following table summarizes key capabilities of each UDF programming language. The list below describes the columns of the table:

• Tabular: Indicates whether Snowflake supports writing table functions (in addition to scalar functions).

• Looping and Branching: Indicates whether the programming language supports looping and branching.

• Pre-compiled: Indicates whether you can supply the code for your UDF as a compiled file (e.g. as a .JAR file).

• In-line: Indicates whether you can provide the code as text in the CREATE FUNCTION statement.

• Sharable: Indicates whether UDFs written in this language can be used with the Snowflake Secure Data Sharing feature.

Language

Tabular

Looping and Branching

Pre-Compiled or Staged Source

In-line

Sharable

SQL

Yes

No

No

Yes

Yes

JavaScript

Yes

Yes

No

Yes

Yes

Java

Yes

Yes

Yes (precompiled)

Yes

No 1

Python

Yes

Yes

Yes (staged)

Yes

No 2

1

2

Naming Conventions for UDFs¶

UDFs are database objects, meaning that they are created in a specified database and schema. As such, they have a fully-qualified name defined by their namespace, in the form of db.schema.function_name, for example:

SELECT temporary_db_qualified_names_test.temporary_schema_1.udf_pi();


When called without their fully-qualified name, UDFs are resolved according to the database and schema in use for the session.

This is in contrast to the built-in, system-defined functions provided by Snowflake, which have no namespace and, therefore, can be called from anywhere.

Conflicts with System-defined Functions¶

To avoid conflicts when calling functions, Snowflake does not allow creating UDFs with the same name as any of the system-defined functions.

Snowflake supports overloading of SQL UDF names. Multiple SQL UDFs in the same schema can have the same name, as long as their argument signatures differ, either by the number of arguments or the argument types. When an overloaded UDF is called, Snowflake checks the arguments and calls the correct function.

Consider the following examples, which create two SQL UDFs named add5:

CREATE OR REPLACE FUNCTION add5 (n number)
RETURNS number
AS 'n + 5';

CREATE OR REPLACE FUNCTION add5 (s string)
RETURNS string
AS 's || ''5''';


Important

In the second ADD5 function, single quotes are used to escape the string literal '5'. Any single quotes used in a UDF definition must be escaped by single quotes.

If add5 is called with a numeric argument, then the first implementation is chosen, while a string-typed argument uses the second implementation. If the argument is neither a number nor a string, then the implementation depends on Snowflake’s implicit type conversion rules. For instance, a date-typed argument would be converted to a string, and the string implementation would be selected, because conversion from DATE to NUMBER is not supported.

For example:

select add5(1);

+---------+
|---------|
|       6 |
+---------+

+-----------+
|-----------|
| 15        |
+-----------+

+---------------+
|---------------|
| hello5        |
+---------------+

+-----------------------------+
|-----------------------------|
| 2014-01-015                 |
+-----------------------------+


Caution

For UDFs written in languages other than SQL, the rules might differ.

Examples¶

Create a function that takes a FLOAT parameter:

CREATE FUNCTION add_pi(PARAM_1 FLOAT)
RETURNS FLOAT
LANGUAGE SQL
AS $$PARAM_1 + 3.1415926::FLOAT$$;


Call the function twice. The first time, pass a FLOAT. The second time, pass a VARCHAR. The VARCHAR is converted to a FLOAT, and the output from each call is identical:

SELECT add_pi(1.0), add_pi('1.0');
+-------------+---------------+
|-------------+---------------|
|   4.1415926 |     4.1415926 |
+-------------+---------------+


Now, create an overloaded function that takes a VARCHAR parameter:

CREATE FUNCTION add_pi(PARAM_1 VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS $$PARAM_1 || ' + 3.1415926'$$;


Now use exact the same CALLs as before. Note the difference in output between these two CALLs and the previous two CALLs.

SELECT add_pi(1.0), add_pi('1.0');
+-------------+-----------------+