Calling a UDF¶
You can call a user-defined function (UDF) or user-defined table function (UDTF) in the same way that you call other functions.
Calling a UDF¶
In general, you call a UDF same way that you call other functions.
Code in the following example calls a UDF called
my_table as an argument.
SELECT my_udf(column_1) FROM my_table;
Calling a UDTF¶
You can call a UDTF the way you would call any table function. When calling a UDTF in the FROM clause of a query, specify the UDTF’s name and arguments inside the parentheses that follow the TABLE keyword.
In other words, use a form such as the following for the TABLE keyword when calling a UDTF:
SELECT ... FROM TABLE ( udtf_name (udtf_arguments) )
Code in the following example calls the
my_java_udtf table function, specifying a DATE literal in the argument
SELECT ... FROM TABLE(my_java_udtf('2021-01-16'::DATE));
The argument to a table function can be an expression, not just a literal. For example, a table function can be called using a column from a table. Some examples are below, including in the Examples section.
For more information about table functions in general, see table function.
You cannot call a UDF within the DEFAULT clause of a CREATE TABLE statement.
Using a Table or UDTF as Input to a UDTF¶
The input to a table function can come from a table or from another UDTF, as documented in Using a Table as Input to a Table Function.
The example below shows how to use a table to provide input to the UDTF
create table file_names (file_name varchar); insert into file_names (file_name) values ('sample.txt'), ('sample_2.txt'); select f.file_name, w.word from file_names as f, table(split_file_into_words(f.file_name)) as w;
The output looks similar to the following:
+-------------------+------------+ | FILE_NAME | WORD | +-------------------+------------+ | sample_data.txt | some | | sample_data.txt | words | | sample_data_2.txt | additional | | sample_data_2.txt | words | +-------------------+------------+
The IMPORTS clause of the UDTF must specify the name and path of each file passed to the UDTF. For example:
create function split_file_into_words(inputFileName string) ... imports = ('@inline_jars/sample.txt', '@inline_jars/sample_2.txt') ...
Each file must already have been copied to a stage (in this case, the stage named
@inline_jars) before the UDTF reads the file.
Table Functions and Partitions¶
Before rows are passed to table functions, the rows can be grouped into partitions. Partitioning has two main benefits:
Partitioning allows Snowflake to divide up the workload to improve parallelization and thus performance.
Partitioning allows Snowflake to process all rows with a common characteristic as a group. You can return results that are based on all rows in the group, not just on individual rows.
For example, you might partition stock price data into one group per stock. All stock prices for an individual company can be analyzed together, while stock prices for each company can be analyzed independently of any other company.
Data can be partitioned explicitly or implicitly.
Explicit Partitioning into Multiple Groups
The following statement calls the UDTF named
my_udtf on individual partitions. Each partition contains all rows for which
PARTITION BY expression evaluates to the same value (e.g. the same company or stock symbol).
SELECT * FROM stocks_table AS st, TABLE(my_udtf(st.symbol, st.transaction_date, st.price) OVER (PARTITION BY st.symbol))
Explicit Partitioning into a Single Group
The following statement calls the UDTF named
my_udtf on one partition. The
PARTITION BY <constant> clause
(in this case
PARTITION BY 1) puts all rows in the same partition.
SELECT * FROM stocks_table AS st, TABLE(my_udtf(st.symbol, st.transaction_date, st.price) OVER (PARTITION BY 1))
For a more complete and realistic example, see Examples of Calling Java UDTFs in Queries, in particular the subsection titled Single Partition.
Sorting Rows for Partitions
To process each partition’s rows in a specified order, include an ORDER BY clause. This tells Snowflake to pass the rows to the per-row handler method in the specified order.
For example, if you want to calculate the moving average of a stock price over time, then order the stock prices by timestamp (and partition by stock symbol). The following example shows how to do this:
SELECT * FROM stocks_table AS st, TABLE(my_udtf(st.symbol, st.transaction_date, st.price) OVER (PARTITION BY st.symbol ORDER BY st.transaction_date))
An OVER clause can contain an ORDER BY clause even without a PARTITION BY clause.
Remember that including an ORDER BY clause inside an OVER clause is not the same as putting an ORDER BY clause at the outermost level of the query. If you want the entire query results to be ordered, you need a separate ORDER BY clause. For example:
SELECT * FROM stocks_table AS st, TABLE(my_udtf(st.symbol, st.transaction_date, st.price) OVER (PARTITION BY st.symbol ORDER BY st.transaction_date)) ORDER BY st.symbol, st.transaction_date, st.transaction_time;
Usage Notes for Explicit Partitioning
When using a UDTF with a PARTITION BY clause, the PARTITION BY clause must use a column reference or a literal, not a general expression. For example, the following is not allowed:
SELECT * FROM udtf_table, TABLE(my_func(col1) OVER (PARTITION BY udtf_table.col2 * 2)); -- NO!
If a table function does not explicitly partition the rows by using a PARTITION BY clause, then Snowflake typically partitions the rows implicitly to use parallel processing to improve performance.
The number of partitions is typically based on factors such as the size of the warehouse processing the function and the cardinality of the input relation. The rows are typically assigned to specific partitions based on factors such as physical location of the rows (e.g. by micro-partition), so the partition grouping has no meaning.