Upload and run custom functions in a clean room¶
Overview¶
You can upload custom Python UDFs and UDTFs into your clean room and run them from your templates to perform complex data actions. These actions include machine learning or customized data manipulation within a query, as part of a single-step or multi-step flow. Python is the only coding language supported for custom UDFs.
Your uploaded code can import and use packages from an approved bundle of Python packages and the Snowpark API.
Both providers and consumers can upload custom Python code into a clean room, although the process is different for providers and consumers. Each bundle of uploaded code can define multiple functions that call each other, but a bundle exposes only one handler function. This handler function can be called by templates created or run by anyone who uses the clean room. If the code creates internal tables, these tables can be accessed as described in Designing multi-step flows.
You can’t view uploaded code, even your own code. Uploaded code can’t be deleted, but it can be updated.
This page shows you how to upload and run custom Python UDFs and UDTFs as a provider or a consumer.
Tip
For background information about how to develop your own Python UDFs in a clean room, see the following topics:
How UDFs work on Snowflake for general background on how to write Python functions in Snowflake.
How to write UDTFs in Snowflake if you want to return tables from your functions.
How to create and upload custom templates into a clean room. UDFs/UDTFs are called from a custom template.
Using Snowpark in clean rooms (if you want to call your UDFs from Snowpark).
Updating custom functions¶
You can upload or overwrite an existing function that you uploaded, but you can’t delete an existing function.
Uploading a function with the exact same signature as one that you previously uploaded will overwrite the existing function. The signature is the case-insensitive function name of the external handler, and the data types of all parameters, in the same order. Parameter names don’t matter. You can’t overwrite a function uploaded by another account.
Because the signature must match when you update a function, you cannot change the signature of an existing function: if you upload the
function foo(name VARIANT age INTEGER)
and then upload the function foo(name VARIANT age FLOAT)
, the second function will be added
to the clean room in addition to the first, because the argument types differ.
Provider-submitted code¶
Provider-submitted functions can be uploaded as inline code or from a Snowflake stage. Both techniques are covered here.
Your uploaded code can natively import and use packages from an approved set of Python packages. If you need a non-default package, you must use Snowpark Container Services in a clean room to host your code.
Tip
After updating provider-written code, you should update the default release directive and then call
provider.create_or_update_cleanroom_listing
to propagate the changes to consumers. If you do not call
provider.create_or_update_cleanroom_listing
, your default version will not update for consumers that are currently using the clean
room.
Here is a high level view of how a provider adds code to a clean room:
The provider creates and configures the clean room in the normal way.
The provider uploads code by calling
provider.load_python_into_cleanroom
. You can either upload your code inline directly within that procedure, or upload a code file to a stage, then provide the stage location to that procedure.Although your code can include multiple functions, only one handler is exposed for each upload. If you expose multiple functions to templates, upload each handler by calling
provider.load_python_into_cleanroom
.After each successful code upload, a new patch version of the clean room is generated. You must then increase the default version by calling
provider.set_default_release_directive
with the new patch number. If the clean room is exposed externally, security checks are run before installing your code, and you must callprovider.view_cleanroom_scan_status
to confirm that security checks have passed before incrementing the default version.You create and upload a custom template that calls your code. The template calls the handler function using the
cleanroom
scope, that is:cleanroom.my_function
. For example, a template that calls a customsimple_add
function that you uploaded might look like this:SELECT cleanroom.simple_add(1, 2), cleanroom.simple_add({{ price | sqlsafe | int }}, {{ tax | sqlsafe | int }})
The consumer runs your template the same way as any other template.
Tip
If the consumer encounters a mount error when they install a clean room with custom code, this can indicate a syntax error in the code.
You can find code examples demonstrating this flow in the provider-written code example section.
Important notes about versioning¶
Every time the provider uploads a function, it increases the patch number (and there is a limit of 99 patch numbers). Therefore, do your best to test and debug your code thoroughly before adding it to the clean room to reduce version updates during development.
If you do update a patch number, customers using the clean room UI might need to refresh the page to see the change. Customers using the API should see changes immediately, but there can be a delay, depending on the available resources. Learn more about clean room versioning.
Uploading provider-written inline functions¶
You can upload the code inline in the code
parameter of provider.load_python_into_cleanroom
. Here is an example of uploading a
simple function inline:
CALL samooha_by_snowflake_local_db.provider.load_python_into_cleanroom(
$cleanroom_name,
'simple_add', -- Name used to call the UDF from a template.
['first INTEGER', 'second INTEGER'], -- Arguments of the UDF, specified as '<variable_name> <SQL type>' pairs.
['numpy', 'pandas'], -- Packages imported by the UDF.
'INTEGER', -- SQL return type of UDF.
'add_two', -- Handler function in your code called when external name is called.
$$
import numpy as np # Not used, but you can load supported packages.
import pandas as pd
def add_two(first, second):
return first + second
$$
);
The calling template calls cleanroom.simple_add
to call this function.
The provider examples demonstrate how to upload inline code.
Uploading provider-written functions from a stage¶
You can upload Python files to a clean room stage and reference the stage when you call provider.load_python_into_cleanroom
. Loading
code from a stage allows you to develop the code in your local system in an editor, avoid copy/paste errors loading it inline, and also have
better versioning control. Note that you can upload multiple files in one procedure call, but only one handler function is exposed for each
upload.
Code is loaded from a stage into the clean room when you call load_python_into_cleanroom
; later changes to the code on the stage are not
propagated to the clean room.
To upload your UDF to a stage:
Create your .py file and make it available in a location where you can upload it to a Snowsight stage.
To get the name of the stage for your clean room, call
provider.get_stage_for_python_files($cleanroom_name)
. This stage is accessible by the clean room – you cannot use an arbitrary stage that you create.Upload the .py file to the stage for your clean room. There are several ways to do this, including using the CLI, Snowsight, or language-specific drivers.
Call
provider.load_python_into_cleanroom
with the stage location, handler, external name, arguments, and return type. Templates in your clean room can now call the function.
The following example code shows how to load code into a clean room from a stage.
-- Save the following code as reverser.py:
--import numpy as np
--def main(some_string):
-- '''Return the reverse of a string plus a random number 1-10'''
-- return some_string[::-1] + str(np.random.randint(1,10))
-- Get the stage for your clean room.
CALL samooha_by_snowflake_local_db.provider.get_stage_for_python_files($cleanroom_name);
-- Save the file to the stage. Here is how to do it by using the Snowflake CLI
PUT file://~/reverser.py <STAGE_NAME> overwrite=True auto_compress=False;
-- Load the code from the stage into the clean room.
CALL samooha_by_snowflake_local_db.provider.load_python_into_cleanroom(
$cleanroom_name,
'reverse', -- Name used to call the function
['some_string STRING'], -- Arguments and SQL types
['numpy'], -- Any required packages
['/reverser.py'], -- Relative path to file on stage
'STRING', -- Return type
'reverser.main' -- <FILE_NAME>.<FUNCTION_NAME>
);
-- Uploading code, even from a stage, increases the patch number.
CALL samooha_by_snowflake_local_db.provider.set_default_release_directive(
$cleanroom_name, 'V1_0', <NEW_PATCH_NUMBER>);
-- Upload a template that calls the function.
CALL samooha_by_snowflake_local_db.provider.add_custom_sql_template(
$cleanroom_name,
$udf_template_name,
$$
SELECT
p.status,
cleanroom.reverse(p.status)
FROM SAMOOHA_SAMPLE_DATABASE.DEMO.CUSTOMERS AS p
LIMIT 100;
$$
);
-- Switch to the consumer account and run the template to see the results.
The provider examples demonstrate uploading code from a stage.
Provider-written code examples¶
The following examples demonstrate adding provider-written UDFs and UDTFs to a clean room.
Download the following examples and then upload them as worksheet files in your Snowflake account. You need separate accounts for the provider and consumer, each with the clean rooms API installed. Replace the information as noted in the sample files.
Loading a file from a stage
. Run this notebook after you run the provider example to try loading a UDF from a stage.
Consumer-submitted code¶
A consumer can submit UDF or UDTF code and call it from a custom template that they submit. Consumer-uploaded code is bundled into a single procedure along with a custom template, and uploaded in a single procedure call. The consumer code is tied directly to that template and cannot be called by other templates.
To upload code as a consumer, you should understand custom template syntax and how to submit a consumer-defined template.
Here is an overview of the steps to upload custom consumer code:
The provider creates the clean room in the standard way and then invites the consumer.
The consumer installs and configures the clean room in the standard way.
The consumer prepares a template. The template calls the UDF or UDTF within the
cleanroom
namespace. For example, to call the consumer-definedcalculate_tax
function, a simple template might look like the following snippet:SELECT {{ cleanroom.calculate_tax(p.cost) }} AS Tax FROM my_db.my_sch.sales AS p;
The consumer prepares their Python code. We recommend using double quotation marks (“”) rather than single quotation marks (’) in your code to avoid extra escaping needed later. Your code can reference a bundle of selected Python libraries.
The consumer passes their Python code into
consumer.generate_python_request_template
. The procedure returns the Python code as a stored procedure, with a placeholder for the custom JinjaSQL template. There are several multi-line strings in the template that use$$
as multi-line delimiters.Replace the template placeholder in the output from
generate_python_request_template
with your JinjaSQL template.In the combined template, escape any single quotes like this:
\'
. This is because single quotes will be used as the outermost delimiter for the entire multi-line procedure string. For example:BEGIN CREATE OR REPLACE FUNCTION CLEANROOM.custom_compare(min_status STRING, max_status STRING, this_status STRING) RETURNS boolean LANGUAGE PYTHON RUNTIME_VERSION = 3.10 PACKAGES = (\'numpy\') HANDLER = \'custom_compare\' AS $$ import numpy as np def custom_compare(min_status:str, max_status:str, this_status:str): statuses = [\'MEMBER\', \'SILVER\', \'GOLD\', \'PLATINUM\'] return ((statuses.index(this_status) >= statuses.index(min_status)) & (statuses.index(this_status) <= statuses.index(max_status))) $$; -- Custom template LET SQL_TEXT varchar := $$ SELECT c.status, c.hashed_email FROM IDENTIFIER( {{ my_table[0] }} ) as c WHERE cleanroom.custom_compare({{ min_status }}, {{ max_status }}, c.status); $$; LET RES resultset := (EXECUTE IMMEDIATE :SQL_TEXT); RETURN TABLE(RES); END;
Call
consumer.create_template_request
with your combined template. Use single quotation marks ‘’ instead of double dollar sign delimiters $$…$$ around the code you provide for stored procedure in thetemplate_definition
argument. For example:CALL samooha_by_snowflake_local_db.consumer.create_template_request( $cleanroom_name, $template_name, ' BEGIN -- First, define the Python UDF. CREATE OR REPLACE FUNCTION CLEANROOM.custom_compare(min_status STRING, max_status STRING, this_status STRING) RETURNS boolean LANGUAGE PYTHON RUNTIME_VERSION = 3.10 PACKAGES = (\'numpy\') HANDLER = \'custom_compare\' AS $$ import numpy as np def custom_compare(min_status:str, max_status:str, this_status:str): statuses = [\'MEMBER\', \'SILVER\', \'GOLD\', \'PLATINUM\'] return ((statuses.index(this_status) >= statuses.index(min_status)) & (statuses.index(this_status) <= statuses.index(max_status))) $$; -- Then define and execute the SQL query. LET SQL_TEXT varchar := $$ SELECT c.status, c.hashed_email FROM IDENTIFIER( {{ my_table[0] }} ) as c WHERE cleanroom.custom_compare({{ min_status }}, {{ max_status }}, c.status); $$; -- Execute the query and then return the result. LET RES resultset := (EXECUTE IMMEDIATE :SQL_TEXT); RETURN TABLE(RES); END; ');
The consumer and provider continue with the standard consumer-defined template flow:
The provider views (
provider.list_pending_template_requests
) and then approves the template request (approve_template_request
)The consumer checks the request status (
consumer.list_template_requests
), and when the status is APPROVED, runs the template (consumer.run_analysis
).
Consumer-written code examples¶
The following examples demonstrate adding provider-written UDFs to a clean room.
Download the following examples and then upload them as worksheet files in your Snowflake account. You need separate accounts for the provider and consumer, each with the clean rooms API installed. Replace the information as noted in the sample files: