CREATE FUNCTION¶
Creates a new UDF (user-defined function). The function can return either scalar results (as a UDF) or tabular results (as a UDTF).
When you create a UDF, you specify a handler whose code is written in one of the supported languages. Depending on the handler’s language, you can either include the handler source code in-line with the CREATE FUNCTION statement or reference the handler’s location from CREATE FUNCTION, where the handler is precompiled or source code on a stage.
The following table lists supported languages.
Language |
Pre-Compiled or Staged Source |
In-line |
---|---|---|
Java |
Yes (precompiled) |
Yes |
JavaScript |
No |
Yes |
Python |
Yes (staged source) |
Yes |
SQL |
No |
Yes |
- See also:
ALTER FUNCTION, DROP FUNCTION, SHOW USER FUNCTIONS , DESCRIBE FUNCTION
Syntax¶
The syntax for CREATE FUNCTION varies depending on which language you’re using as the UDF handler.
Java Handler
Use the syntax below if the source code is in-line:
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] [ SECURE ] FUNCTION [ IF NOT EXISTS ] <name> ( [ <arg_name> <arg_data_type> ] [ , ... ] )
RETURNS { <result_data_type> | TABLE ( <col_name> <col_data_type> [ , ... ] ) }
[ [ NOT ] NULL ]
LANGUAGE JAVA
[ { CALLED ON NULL INPUT | { RETURNS NULL ON NULL INPUT | STRICT } } ]
[ VOLATILE | IMMUTABLE ]
[ RUNTIME_VERSION = <java_jdk_version> ]
[ COMMENT = '<string_literal>' ]
[ IMPORTS = ( '<stage_path_and_file_name_to_read>' [ , ... ] ) ]
[ PACKAGES = ( '<package_name_and_version>' [ , ... ] ) ]
HANDLER = '<path_to_method>'
[ TARGET_PATH = '<stage_path_and_file_name_to_write>' ]
AS '<function_definition>'
Use the syntax below if the source code is pre-compiled (i.e. if the code is provided as a JAR file):
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] [ SECURE ] FUNCTION [ IF NOT EXISTS ] <name> ( [ <arg_name> <arg_data_type> ] [ , ... ] )
RETURNS { <result_data_type> | TABLE ( <col_name> <col_data_type> [ , ... ] ) }
[ [ NOT ] NULL ]
LANGUAGE JAVA
[ { CALLED ON NULL INPUT | { RETURNS NULL ON NULL INPUT | STRICT } } ]
[ VOLATILE | IMMUTABLE ]
[ RUNTIME_VERSION = <java_jdk_version> ]
[ COMMENT = '<string_literal>' ]
IMPORTS = ( '<stage_path_and_file_name_to_read>' [ , ... ] )
HANDLER = '<path_to_method>'
JavaScript Handler
CREATE [ OR REPLACE ] [ SECURE ] FUNCTION <name> ( [ <arg_name> <arg_data_type> ] [ , ... ] )
RETURNS { <result_data_type> | TABLE ( <col_name> <col_data_type> [ , ... ] ) }
[ [ NOT ] NULL ]
LANGUAGE JAVASCRIPT
[ { CALLED ON NULL INPUT | { RETURNS NULL ON NULL INPUT | STRICT } } ]
[ VOLATILE | IMMUTABLE ]
[ COMMENT = '<string_literal>' ]
AS '<function_definition>'
Python Handler
Use the syntax below if the source code is in-line:
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] [ SECURE ] FUNCTION [ IF NOT EXISTS ] <name> ( [ <arg_name> <arg_data_type> ] [ , ... ] )
RETURNS { <result_data_type> | TABLE ( <col_name> <col_data_type> [ , ... ] ) }
[ [ NOT ] NULL ]
LANGUAGE PYTHON
[ { CALLED ON NULL INPUT | { RETURNS NULL ON NULL INPUT | STRICT } } ]
[ VOLATILE | IMMUTABLE ]
RUNTIME_VERSION = <python_version>
[ COMMENT = '<string_literal>' ]
[ IMPORTS = ( '<stage_path_and_file_name_to_read>' [ , ... ] ) ]
[ PACKAGES = ( '<package_name>[==<version>]' [ , ... ] ) ]
HANDLER = '<function_name>'
AS '<function_definition>'
Use the syntax below if you’re referencing the source code at a stage location:
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] FUNCTION [ IF NOT EXISTS ] <name> ( [ <arg_name> <arg_data_type> ] [ , ... ] )
RETURNS { <result_data_type> | TABLE ( <col_name> <col_data_type> [ , ... ] ) }
[ [ NOT ] NULL ]
LANGUAGE PYTHON
[ { CALLED ON NULL INPUT | { RETURNS NULL ON NULL INPUT | STRICT } } ]
[ VOLATILE | IMMUTABLE ]
RUNTIME_VERSION = <python_version>
[ COMMENT = '<string_literal>' ]
IMPORTS = ( '<stage_path_and_file_name_to_read>' [ , ... ] )
[ PACKAGES = ( '<package_name>[==<version>]' [ , ... ] ) ]
HANDLER = '<module_file_name>.<function_name>'
SQL Handler
CREATE [ OR REPLACE ] [ SECURE ] FUNCTION <name> ( [ <arg_name> <arg_data_type> ] [ , ... ] )
RETURNS { <result_data_type> | TABLE ( <col_name> <col_data_type> [ , ... ] ) }
[ [ NOT ] NULL ]
[ { CALLED ON NULL INPUT | { RETURNS NULL ON NULL INPUT | STRICT } } ]
[ VOLATILE | IMMUTABLE ]
[ MEMOIZABLE ]
[ COMMENT = '<string_literal>' ]
AS '<function_definition>'
Required Parameters¶
All Languages
name ( [ arg_name arg_data_type ] [ , ... ] )
Specifies the identifier (and optionally one or more arguments/inputs) for the UDF. The identifier does not need to be unique for the schema in which the UDF is created because UDFs are identified and resolved by their name and argument types.
However, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (e.g.
"My object"
). Identifiers enclosed in double quotes are also case-sensitive.For more details, see Identifier Requirements.
RETURNS ...
Specifies the results returned by the UDF, which determines the UDF type:
result_data_type
: Creates a scalar UDF that returns a single value with the specified data type.TABLE ( col_name col_data_type , ... )
: Creates a table UDF that returns tabular results with the specified table column(s) and column type(s).
Note
For Java UDFs, the
result_data_type
must be in theSQL Data Type
column of the SQL-Java Type Mappings table.For Python UDFs, the
result_data_type
must be in theSQL Data Type
column of the SQL-Python Type Mappings table.
AS function_definition
Defines the handler code executed when the UDF is called. The
function_definition
value must be source code in one of the languages supported for handlers. The code may be:Java. For more information, see Writing Java UDFs.
JavaScript. For more information, see Writing JavaScript UDFs.
Python. For more information, see Writing Python UDFs.
A SQL expression. For more information, see Writing SQL UDFs.
For more details, see Usage Notes (in this topic).
Note
The AS clause is not required when the UDF handler code is referenced on a stage with the IMPORTS clause.
Java
LANGUAGE JAVA
Specifies that the code is in the Java language.
RUNTIME_VERSION = java_jdk_version
Specifies the Java JDK runtime version to use. The supported versions of Java are:
11.x
If
RUNTIME_VERSION
is not set, Java JDK 11 is used.IMPORTS = (stage_path_and_file_name_to_read)
The location (stage), path, and name of the file(s) to import.
A file can be a JAR file or another type of file.
If the file is a JAR file, it can contain one or more .class files and zero or more resource files.
JNI (Java Native Interface) is not supported. Snowflake prohibits loading libraries that contain native code (as opposed to Java bytecode).
Java UDFs can also read non-JAR files. For an example, see Reading a File Using IMPORTS.
If you plan to copy a file (JAR file or other file) to a stage, then Snowflake recommends using a named internal stage because the
PUT
command supports copying files to named internal stages, and thePUT
command is usually the easiest way to move a JAR file to a stage.External stages are allowed, but are not supported by
PUT
.Each file in the
IMPORTS
clause must have a unique name, even if the files are in different subdirectories or different stages.If both the
IMPORTS
andTARGET_PATH
clauses are present, the file name in theTARGET_PATH
clause must be different from each file name in theIMPORTS
clause, even if the files are in different subdirectories or different stages.Snowflake returns an error if the
TARGET_PATH
matches an existing file; you cannot useTARGET_PATH
to overwrite an existing file.For a pre-compiled Java UDF, the
IMPORTS
clause is required because it specifies the location of the JAR file that contains the UDF.For an in-line Java UDF, the
IMPORTS
clause is needed only if the in-line Java UDF needs to access other files, such as libraries or text files.For Snowflake system packages, such the Snowpark package, you can specify the package with the
PACKAGES
clause rather than specifying its JAR file withIMPORTS
. When you do, the package JAR file need not be included in anIMPORTS
value.In-line Java
function_definition
In-line Java UDFs require a function definition.
JavaScript
LANGUAGE JAVASCRIPT
Specifies that the code is in the JavaScript language.
Python
LANGUAGE PYTHON
Specifies that the code is in the Python language.
RUNTIME_VERSION = python_version
Specifies the Python version to use. The supported versions of Python are:
3.8
IMPORTS = (stage_path_and_file_name_to_read)
The location (stage), path, and name of the file(s) to import.
A file can be a .py file or another type of file.
Python UDFs can also read non-Python files, such as text files. For an example, see Reading Files with a UDF Handler.
If you plan to copy a file to a stage, then Snowflake recommends using a named internal stage because the
PUT
command supports copying files to named internal stages, and thePUT
command is usually the easiest way to move a file to a stage.External stages are allowed, but are not supported by
PUT
.Each file in the
IMPORTS
clause must have a unique name, even if the files are in different subdirectories or different stages.When the handler code is stored in a stage, you must use the
IMPORTS
clause to specify the handler code’s location.For an in-line Python UDF, the
IMPORTS
clause is needed only if the UDF handler needs to access other files, such as packages or text files.For packages included on the Snowflake system, such numpy, you can specify the package with the
PACKAGES
clause alone, omitting the package’s source as anIMPORTS
value.HANDLER = handler_name
The name of the handler function or class.
If the handler is for a scalar UDF, returning a non-tabular value, the HANDLER value should be a function name. If the handler code is in-line with the CREATE FUNCTION statement, you can use the function name alone. When the handler code is referenced at a stage, this value should be qualified with the module name, as in the following form:
my_module.my_function
.If the handler is for a tabular UDF, the HANDLER value should be the name of a handler class.
Optional Parameters¶
All Languages
SECURE
Specifies that the function is secure. For more information about secure functions, see Protecting Sensitive Information with Secure UDFs and Stored Procedures.
[ [ NOT ] NULL ]
Specifies whether the function can return NULL values or must return only NON-NULL values. The default is NULL (i.e. the function can return NULL).
Note
Currently, the
NOT NULL
clause is not enforced for SQL UDFs. SQL UDFs declared asNOT NULL
can return NULL values. Snowflake recommends avoidingNOT NULL
for SQL UDFs unless the code in the function is written to ensure that NULL values are never returned.CALLED ON NULL INPUT
or .RETURNS NULL ON NULL INPUT | STRICT
Specifies the behavior of the UDF when called with null inputs. In contrast to system-defined functions, which always return null when any input is null, UDFs can handle null inputs, returning non-null values even when an input is null:
CALLED ON NULL INPUT
will always call the UDF with null inputs. It is up to the UDF to handle such values appropriately.RETURNS NULL ON NULL INPUT
(or its synonymSTRICT
) will not call the UDF if any input is null. Instead, a null value will always be returned for that row. Note that the UDF might still return null for non-null inputs.
Default:
CALLED ON NULL INPUT
VOLATILE | IMMUTABLE
Specifies the behavior of the UDF when returning results:
VOLATILE
: UDF might return different values for different rows, even for the same input (e.g. due to non-determinism and statefullness).IMMUTABLE
: UDF assumes that the function, when called with the same inputs, will always return the same result. This guarantee is not checked. SpecifyingIMMUTABLE
for a UDF that returns different values for the same input will result in undefined behavior.
Default:
VOLATILE
COMMENT = 'string_literal'
Specifies a comment for the UDF, which is displayed in the DESCRIPTION column in the SHOW FUNCTIONS and SHOW USER FUNCTIONS output.
Default:
user-defined function
Java
PACKAGES = (package_name_and_version)
The name and version number of Snowflake system packages required as dependencies. The value should be of the form
package_name:version_number
, wherepackage_name
issnowflake_domain:package
. Note that you can specifylatest
as the version number in order to have Snowflake use the latest version available on the system.For example:
-- Use version 1.2.0 of the Snowpark package. packages=('com.snowflake:snowpark:1.2.0') -- Use the latest version of the Snowpark package. packages=('com.snowflake:snowpark:latest')
You can discover the list of supported system packages by executing the following SQL in Snowflake:
select * from information_schema.packages where language ='java';
For a dependency you specify with
PACKAGES
, you do not need to also specify its JAR file in anIMPORTS
clause.In-line Java
TARGET_PATH = stage_path_and_file_name_to_write
The
TARGET_PATH
clause specifies the location to which Snowflake should write the compiled code (JAR file) after compiling the source code specified in thefunction_definition
.If this clause is included, the user should manually remove the JAR file when it is no longer needed (typically when the Java UDF is dropped).
If this clause is omitted, Snowflake re-compiles the source code each time the code is needed. The JAR file is not stored permanently, and the user does not need to clean up the JAR file.
Python
PACKAGES = (package_name[==package_version])
The name and version number of packages required as dependencies. The value should be of the form
package_name==version_number
. If you omit the version number, Snowflake will use the latest package available on the system.For example:
-- Use version 1.2.2 of the NumPy package. packages=('numpy==1.2.2') -- Use the latest version of the NumPy package. packages=('numpy')
You can discover the list of supported system packages by executing the following SQL in Snowflake:
select * from information_schema.packages where language ='python';
For more information about included packages, see Using Third-Party Packages.
SQL
MEMOIZABLE
Specifies that the function is memoizable.
For details, see Memoizable UDFs.
Usage Notes¶
All Languages
function_definition
has size restrictions. The maximum allowable size is subject to change.The delimiters around the
function_definition
can be either single quotes or a pair of dollar signs.Using
$$
as the delimiter makes it easier to write stored procedures that contain single quotes.If the delimiter for the body of the function is the single quote character, then any single quotes within
function_definition
(e.g. string literals) must be escaped by single quotes.If using a UDF in a masking policy, ensure the data type of the column, UDF, and masking policy match. For more information, see User-defined Functions in a Masking Policy.
Regarding metadata:
Attention
Customers should ensure that no personal data (other than for a User object), sensitive data, export-controlled data, or other regulated data is entered as metadata when using the Snowflake service. For more information, see Metadata Fields in Snowflake.
CREATE OR REPLACE <object> statements are atomic. That is, when the object is replaced, the old object deletion and the new object creation are processed in a single transaction.
Java
In Java, primitive data types don’t allow NULL values, so passing a NULL for an argument of such a type results in an error.
In the
HANDLER
clause, the method name is case-sensitive.In the
IMPORTS
andTARGET_PATH
clauses:Package, class, and file name(s) are case-sensitive.
Stage name(s) are case-insensitive.
You can use the
PACKAGES
clause to specify package name and version number for Snowflake system-defined dependencies, such as those from Snowpark. For other dependencies, specify dependency JAR files with theIMPORTS
clause.Snowflake validates that:
The JAR file specified in the
CREATE FUNCTION
statement’sHANDLER
exists and contains the specified class and method.The input and output types specified in the UDF declaration are compatible with the input and output types of the Java method.
Validation can be done at creation time or execution time.
If the user is connected to an active Snowflake warehouse at the time the
CREATE FUNCTION
statement is executed, then the UDF is validated at creation time.Otherwise, the UDF is created, but is not validated immediately, and Snowflake returns the following message:
Function <name> created successfully, but could not be validated since there is no active warehouse
.
JavaScript
Snowflake does not validate JavaScript code at UDF creation time (i.e. creation of the UDF succeeds regardless of whether the code is valid). If the code is not valid, errors will be returned when the UDF is called at query time.
Python
In the
HANDLER
clause, the handler function name is case-sensitive.In the
IMPORTS
clause:File name(s) are case-sensitive.
Stage name(s) are case-insensitive.
You can use the
PACKAGES
clause to specify package name and version number for dependencies, such as those from Snowpark. For other dependencies, specify dependency files with theIMPORTS
clause.Snowflake validates that:
The function or class specified in the
CREATE FUNCTION
statement’sHANDLER
exists.The input and output types specified in the UDF declaration are compatible with the input and output types of the handler.
Validation can be done at creation time or execution time.
If the user is connected to an active Snowflake warehouse at the time the
CREATE FUNCTION
statement is executed, then the UDF is validated at creation time.Otherwise, the UDF is created, but is not validated immediately, and Snowflake returns the following message:
Function <name> created successfully, but could not be validated since there is no active warehouse
.
SQL
Currently, the
NOT NULL
clause is not enforced for SQL UDFs.
Examples¶
Java
Here is a basic example of an in-line CREATE FUNCTION statement:
create or replace function echo_varchar(x varchar)
returns varchar
language java
called on null input
handler='TestFunc.echoVarchar'
target_path='@~/testfunc.jar'
as
'class TestFunc {
public static String echoVarchar(String x) {
return x;
}
}';
Here is a basic example of a pre-compiled CREATE FUNCTION statement:
create function my_decrement_udf(i numeric(9, 0))
returns numeric
language java
imports = ('@~/my_decrement_udf_package_dir/my_decrement_udf_jar.jar')
handler = 'my_decrement_udf_package.my_decrement_udf_class.my_decrement_udf_method'
;
For more examples of Java UDFs, see examples.
JavaScript
Create a JavaScript UDF named js_factorial
:
CREATE OR REPLACE FUNCTION js_factorial(d double)
RETURNS double
LANGUAGE JAVASCRIPT
STRICT
AS '
if (D <= 0) {
return 1;
} else {
var result = 1;
for (var i = 2; i <= D; i++) {
result = result * i;
}
return result;
}
';
Python
Code in the following example creates a py_udf
function whose handler code is in-line as udf
.
create or replace function py_udf()
returns variant
language python
runtime_version = '3.8'
packages = ('numpy','pandas','xgboost==1.5.0')
handler = 'udf'
as $$
import numpy as np
import pandas as pd
import xgboost as xgb
def udf():
return [np.__version__, pd.__version__, xgb.__version__]
$$;
Code in the following example creates a dream
function whose handler is in a sleepy.py
file located on the
@my_stage
stage.
create or replace function dream(i int)
returns variant
language python
runtime_version = '3.8'
handler = 'sleepy.snore'
imports = ('@my_stage/sleepy.py')
SQL
Create a simple SQL scalar UDF that returns a hard-coded approximation of the mathematical constant pi:
CREATE FUNCTION pi_udf()
RETURNS FLOAT
AS '3.141592654::FLOAT'
;
Create a simple SQL table UDF that returns hard-coded values:
CREATE FUNCTION simple_table_function ()
RETURNS TABLE (x INTEGER, y INTEGER)
AS
$$
SELECT 1, 2
UNION ALL
SELECT 3, 4
$$
;
SELECT * FROM TABLE(simple_table_function());
Output:
SELECT * FROM TABLE(simple_table_function());
+---+---+
| X | Y |
|---+---|
| 1 | 2 |
| 3 | 4 |
+---+---+
Create a UDF that accepts multiple parameters:
CREATE FUNCTION multiply1 (a number, b number)
RETURNS number
COMMENT='multiply two numbers'
AS 'a * b';
Create a SQL table UDF named get_countries_for_user
that returns the results of a query:
CREATE OR REPLACE FUNCTION get_countries_for_user ( id number )
RETURNS TABLE (country_code char, country_name varchar)
AS 'select distinct c.country_code, c.country_name
from user_addresses a, countries c
where a.user_id = id
and c.country_code = a.country_code';