snowflake.snowpark.stored_procedure.StoredProcedureRegistration¶
- class snowflake.snowpark.stored_procedure.StoredProcedureRegistration(session: Optional[Session])[source]¶
Bases:
object
Provides methods to register lambdas and functions as stored procedures in the Snowflake database. For more information about Snowflake Python stored procedures, see Python stored procedures.
session.sproc
returns an object of this class. You can use this object to register stored procedures that you plan to use in the current session or permanently. The methods that register a stored procedure return aStoredProcedure
object.Note that the first parameter of your function should be a snowpark Session. Also, you need to add snowflake-snowpark-python package (version >= 0.4.0) to your session before trying to create a stored procedure.
There are two ways to register a stored procedure with Snowpark:
Use
sproc()
orregister()
. By pointing to a runtime Python function, Snowpark uses cloudpickle to serialize this function to bytecode, and deserialize the bytecode to a Python function on the Snowflake server during stored procedure creation. During the serialization, the global variables used in the Python function will be serialized into the bytecode, but only the name of the module object or any objects from a module that are used in the Python function will be serialized. If the size of the serialized bytecode is over 8K bytes, it will be uploaded to a stage location as a Python file. If it’s under 8K, it will be added to the Stored Procedure in-line code.During the deserialization, Python will look up the corresponding modules and objects by names.
Details could be found in
snowflake.snowpark.udf.UDFRegistration
.Use
register_from_file()
. By pointing to a Python file or a zip file containing Python source code and the target function name, Snowpark uploads this file to a stage (which can also be customized), and load the corresponding function from this file to the Python runtime on the Snowflake server during stored procedure creation. Then this function will be invoked when calling this stored procedure. This approach can address the deficiency of the previous approach that uses cloudpickle, because the source code in this file other than the target function will be loaded during stored procedure creation. Therefore, this approach is useful and efficient when all your Python code is already in source files.
Snowflake supports the following data types for the parameters for a stored procedure:
Python Type
Snowpark Type
SQL Type
int
NUMBER
decimal.Decimal
NUMBER
float
FLOAT
str
STRING
bool
BOOL
datetime.time
TIME
datetime.date
DATE
datetime.datetime
TIMESTAMP
bytes
orbytearray
BINARY
list
ARRAY
dict
OBJECT
Dynamically mapped to the native Python type
VARIANT
dict
GEOGRAPHY
Note
1. Data with the VARIANT SQL type will be converted to a Python type dynamically inside a stored procedure. The following SQL types are converted to
str
in stored procedures rather than native Python types: TIME, DATE, TIMESTAMP and BINARY.2. Data returned as
ArrayType
(list
),MapType
(dict
) orVariantType
(Variant
) by a stored procedure will be represented as a json string. You can calleval()
orjson.loads()
to convert the result to a native Python object. Data returned asGeographyType
(Geography
) by a stored procedure will be represented as a GeoJSON string.3. Currently calling stored procedure that requires VARIANT and GEOGRAPHY input types is not supported in snowpark API.
4. Dataframe returned from
call()
does not support stacking dataframe operations when sql simplifier is disabled, and output columns in return type for the table stored procedure are not defined.- Example 1
Use stored procedure to copy data from one table to another:
>>> import snowflake.snowpark >>> from snowflake.snowpark.functions import sproc >>> >>> session.add_packages('snowflake-snowpark-python') >>> >>> def my_copy(session: snowflake.snowpark.Session, from_table: str, to_table: str, count: int) -> str: ... session.table(from_table).limit(count).write.save_as_table(to_table) ... return "SUCCESS" >>> >>> my_copy_sp = session.sproc.register(my_copy, name="my_copy_sp", replace=True) >>> _ = session.sql("create or replace temp table test_from(test_str varchar) as select randstr(20, random()) from table(generator(rowCount => 100))").collect() >>> >>> # call using sql >>> _ = session.sql("drop table if exists test_to").collect() >>> session.sql("call my_copy_sp('test_from', 'test_to', 10)").collect() [Row(MY_COPY_SP='SUCCESS')] >>> session.table("test_to").count() 10 >>> # call using session.call API >>> _ = session.sql("drop table if exists test_to").collect() >>> session.call("my_copy_sp", "test_from", "test_to", 10) 'SUCCESS' >>> session.table("test_to").count() 10
- Example 2
Create a temporary stored procedure from a lambda and call it:
>>> from snowflake.snowpark.functions import sproc >>> from snowflake.snowpark.types import IntegerType >>> >>> session.add_packages('snowflake-snowpark-python') >>> add_one_sp = sproc( ... lambda session_, x: session_.sql(f"select {x} + 1").collect()[0][0], ... return_type=IntegerType(), ... input_types=[IntegerType()] ... ) >>> add_one_sp(1) 2
- Example 3
Create a stored procedure with type hints and
@sproc
decorator and call it:>>> import snowflake.snowpark >>> from snowflake.snowpark.functions import sproc >>> >>> session.add_packages('snowflake-snowpark-python') >>> @sproc ... def add_sp(session_: snowflake.snowpark.Session, x: int, y: int) -> int: ... return session_.sql(f"select {x} + {y}").collect()[0][0] >>> add_sp(1, 2) 3
- Example 4
Create a permanent stored procedure with a name and call it in SQL:
>>> from snowflake.snowpark.types import IntegerType >>> >>> session.add_packages('snowflake-snowpark-python') >>> _ = session.sql("create or replace temp stage mystage").collect() >>> _ = session.sproc.register( ... lambda session_, x, y: session_.sql(f"SELECT {x} * {y}").collect()[0][0], ... return_type=IntegerType(), ... input_types=[IntegerType(), IntegerType()], ... is_permanent=True, ... name="mul_sp", ... replace=True, ... stage_location="@mystage", ... ) >>> session.sql("call mul_sp(5, 6)").collect() [Row(MUL_SP=30)] >>> # skip stored proc creation if it already exists >>> _ = session.sproc.register( ... lambda session_, x, y: session_.sql(f"SELECT {x} * {y} + 1").collect()[0][0], ... return_type=IntegerType(), ... input_types=[IntegerType(), IntegerType()], ... is_permanent=True, ... name="mul_sp", ... if_not_exists=True, ... stage_location="@mystage", ... ) >>> session.sql("call mul_sp(5, 6)").collect() [Row(MUL_SP=30)] >>> # overwrite stored procedure >>> _ = session.sproc.register( ... lambda session_, x, y: session_.sql(f"SELECT {x} * {y} + 1").collect()[0][0], ... return_type=IntegerType(), ... input_types=[IntegerType(), IntegerType()], ... is_permanent=True, ... name="mul_sp", ... replace=True, ... stage_location="@mystage", ... ) >>> session.sql("call mul_sp(5, 6)").collect() [Row(MUL_SP=31)]
- Example 5
Create a stored procedure with stored-procedure-level imports and call it:
>>> import snowflake.snowpark >>> from resources.test_sp_dir.test_sp_file import mod5 >>> from snowflake.snowpark.functions import sproc >>> >>> session.add_packages('snowflake-snowpark-python') >>> @sproc(imports=[("tests/resources/test_sp_dir/test_sp_file.py", "resources.test_sp_dir.test_sp_file")]) ... def mod5_and_plus1_sp(session_: snowflake.snowpark.Session, x: int) -> int: ... return mod5(session_, x) + 1 >>> mod5_and_plus1_sp(2) 3
- Example 6
Create a stored procedure with stored-procedure-level packages and call it:
>>> import snowflake.snowpark >>> from snowflake.snowpark.functions import sproc >>> import numpy as np >>> import math >>> >>> @sproc(packages=["snowflake-snowpark-python", "numpy"]) ... def sin_sp(_: snowflake.snowpark.Session, x: float) -> float: ... return np.sin(x) >>> sin_sp(0.5 * math.pi) 1.0
- Example 7
Creating a stored procedure from a local Python file:
>>> session.add_packages('snowflake-snowpark-python') >>> # mod5() in that file has type hints >>> mod5_sp = session.sproc.register_from_file( ... file_path="tests/resources/test_sp_dir/test_sp_file.py", ... func_name="mod5", ... ) >>> mod5_sp(2) 2
- Example 8
Creating a stored procedure from a Python file on an internal stage:
>>> from snowflake.snowpark.types import IntegerType >>> >>> session.add_packages('snowflake-snowpark-python') >>> _ = session.sql("create or replace temp stage mystage").collect() >>> _ = session.file.put("tests/resources/test_sp_dir/test_sp_file.py", "@mystage", auto_compress=False) >>> mod5_sp = session.sproc.register_from_file( ... file_path="@mystage/test_sp_file.py", ... func_name="mod5", ... return_type=IntegerType(), ... input_types=[IntegerType()], ... ) >>> mod5_sp(2) 2
- Example 9
Creating a table stored procedure with return type while defining return columns and datatypes:
>>> from snowflake.snowpark.types import IntegerType, StructField, StructType >>> @sproc(return_type=StructType([StructField("A", IntegerType()), StructField("B", IntegerType())]), input_types=[IntegerType(), IntegerType()]) ... def select_sp(session_, x, y): ... return session_.sql(f"SELECT {x} as A, {y} as B") ... >>> select_sp(1, 2).show() ------------- |"A" |"B" | ------------- |1 |2 | -------------
- Example 10
Creating a table stored procedure with return type with free return columns:
>>> from snowflake.snowpark.types import IntegerType, StructType >>> @sproc(return_type=StructType(), input_types=[IntegerType(), IntegerType()]) ... def select_sp(session_, x, y): ... return session_.sql(f"SELECT {x} as A, {y} as B") ... >>> select_sp(1, 2).show() ------------- |"A" |"B" | ------------- |1 |2 | -------------
- Example 11
Creating a table stored procedure using implicit type hints:
>>> from snowflake.snowpark.dataframe import DataFrame >>> @sproc ... def select_sp(session_: snowflake.snowpark.Session, x: int, y: int) -> DataFrame: ... return session_.sql(f"SELECT {x} as A, {y} as B") ... >>> select_sp(1, 2).show() ------------- |"A" |"B" | ------------- |1 |2 | -------------
See also
Methods
describe
(sproc_obj)Returns a
DataFrame
that describes the properties of a stored procedure.register
(func[, return_type, input_types, ...])Registers a Python function as a Snowflake Python stored procedure and returns the stored procedure.
register_from_file
(file_path, func_name[, ...])Registers a Python function as a Snowflake Python stored procedure from a Python or zip file, and returns the stored procedure.