You are viewing documentation about an older version (1.2.0). View latest version

snowflake.snowpark.stored_procedure.StoredProcedureRegistration

class snowflake.snowpark.stored_procedure.StoredProcedureRegistration(session: Session)[source]

Bases: object

Provides methods to register lambdas and functions as stored procedures in the Snowflake database. For more information about Snowflake Python stored procedures, see Python stored procedures.

session.sproc returns an object of this class. You can use this object to register stored procedures that you plan to use in the current session or permanently. The methods that register a stored procedure return a StoredProcedure object.

Note that the first parameter of your function should be a snowpark Session. Also, you need to add snowflake-snowpark-python package (version >= 0.4.0) to your session before trying to create a stored procedure.

There are two ways to register a stored procedure with Snowpark:

  • Use sproc() or register(). By pointing to a runtime Python function, Snowpark uses cloudpickle to serialize this function to bytecode, and deserialize the bytecode to a Python function on the Snowflake server during stored procedure creation. During the serialization, the global variables used in the Python function will be serialized into the bytecode, but only the name of the module object or any objects from a module that are used in the Python function will be serialized. If the size of the serialized bytecode is over 8K bytes, it will be uploaded to a stage location as a Python file. If it’s under 8K, it will be added to the Stored Procedure in-line code.

    During the deserialization, Python will look up the corresponding modules and objects by names.

    Details could be found in snowflake.snowpark.udf.UDFRegistration.

  • Use register_from_file(). By pointing to a Python file or a zip file containing Python source code and the target function name, Snowpark uploads this file to a stage (which can also be customized), and load the corresponding function from this file to the Python runtime on the Snowflake server during stored procedure creation. Then this function will be invoked when calling this stored procedure. This approach can address the deficiency of the previous approach that uses cloudpickle, because the source code in this file other than the target function will be loaded during stored procedure creation. Therefore, this approach is useful and efficient when all your Python code is already in source files.

Snowflake supports the following data types for the parameters for a stored procedure:

Python Type

Snowpark Type

SQL Type

int

LongType

NUMBER

decimal.Decimal

DecimalType

NUMBER

float

FloatType

FLOAT

str

StringType

STRING

bool

BooleanType

BOOL

datetime.time

TimeType

TIME

datetime.date

DateType

DATE

datetime.datetime

TimestampType

TIMESTAMP

bytes or bytearray

BinaryType

BINARY

list

ArrayType

ARRAY

dict

MapType

OBJECT

Dynamically mapped to the native Python type

VariantType

VARIANT

dict

GeographyType

GEOGRAPHY

Note

1. Data with the VARIANT SQL type will be converted to a Python type dynamically inside a stored procedure. The following SQL types are converted to str in stored procedures rather than native Python types: TIME, DATE, TIMESTAMP and BINARY.

2. Data returned as ArrayType (list), MapType (dict) or VariantType (Variant) by a stored procedure will be represented as a json string. You can call eval() or json.loads() to convert the result to a native Python object. Data returned as GeographyType (Geography) by a stored procedure will be represented as a GeoJSON string.

3. Currently calling stored procedure that requires VARIANT and GEOGRAPHY input types is not supported in snowpark API.

Example 1

Use stored procedure to copy data from one table to another:

>>> import snowflake.snowpark
>>> from snowflake.snowpark.functions import sproc
>>>
>>> session.add_packages('snowflake-snowpark-python')
>>>
>>> def my_copy(session: snowflake.snowpark.Session, from_table: str, to_table: str, count: int) -> str:
...     session.table(from_table).limit(count).write.save_as_table(to_table)
...     return "SUCCESS"
>>>
>>> my_copy_sp = session.sproc.register(my_copy, name="my_copy_sp", replace=True)
>>> _ = session.sql("create or replace temp table test_from(test_str varchar) as select randstr(20, random()) from table(generator(rowCount => 100))").collect()
>>>
>>> # call using sql
>>> _ = session.sql("drop table if exists test_to").collect()
>>> session.sql("call my_copy_sp('test_from', 'test_to', 10)").collect()
[Row(MY_COPY_SP='SUCCESS')]
>>> session.table("test_to").count()
10
>>> # call using session#call API
>>> _ = session.sql("drop table if exists test_to").collect()
>>> session.call("my_copy_sp", "test_from", "test_to", 10)
'SUCCESS'
>>> session.table("test_to").count()
10
Copy
Example 2

Create a temporary stored procedure from a lambda and call it:

>>> from snowflake.snowpark.functions import sproc
>>> from snowflake.snowpark.types import IntegerType
>>>
>>> session.add_packages('snowflake-snowpark-python')
>>> add_one_sp = sproc(
...     lambda session_, x: session_.sql(f"select {x} + 1").collect()[0][0],
...     return_type=IntegerType(),
...     input_types=[IntegerType()]
... )
>>> add_one_sp(1)
2
Copy
Example 3

Create a stored procedure with type hints and @sproc decorator and call it:

>>> import snowflake.snowpark
>>> from snowflake.snowpark.functions import sproc
>>>
>>> session.add_packages('snowflake-snowpark-python')
>>> @sproc
... def add_sp(session_: snowflake.snowpark.Session, x: int, y: int) -> int:
...    return session_.sql(f"select {x} + {y}").collect()[0][0]
>>> add_sp(1, 2)
3
Copy
Example 4

Create a permanent stored procedure with a name and call it in SQL:

>>> from snowflake.snowpark.types import IntegerType
>>>
>>> session.add_packages('snowflake-snowpark-python')
>>> _ = session.sql("create or replace temp stage mystage").collect()
>>> _ = session.sproc.register(
...     lambda session_, x, y: session_.sql(f"SELECT {x} * {y}").collect()[0][0],
...     return_type=IntegerType(),
...     input_types=[IntegerType(), IntegerType()],
...     is_permanent=True,
...     name="mul_sp",
...     replace=True,
...     stage_location="@mystage",
... )
>>> session.sql("call mul_sp(5, 6)").collect()
[Row(MUL_SP=30)]
>>> # skip stored proc creation if it already exists
>>> _ = session.sproc.register(
...     lambda session_, x, y: session_.sql(f"SELECT {x} * {y} + 1").collect()[0][0],
...     return_type=IntegerType(),
...     input_types=[IntegerType(), IntegerType()],
...     is_permanent=True,
...     name="mul_sp",
...     if_not_exists=True,
...     stage_location="@mystage",
... )
>>> session.sql("call mul_sp(5, 6)").collect()
[Row(MUL_SP=30)]
>>> # overwrite stored procedure
>>> _ = session.sproc.register(
...     lambda session_, x, y: session_.sql(f"SELECT {x} * {y} + 1").collect()[0][0],
...     return_type=IntegerType(),
...     input_types=[IntegerType(), IntegerType()],
...     is_permanent=True,
...     name="mul_sp",
...     replace=True,
...     stage_location="@mystage",
... )
>>> session.sql("call mul_sp(5, 6)").collect()
[Row(MUL_SP=31)]
Copy
Example 5

Create a stored procedure with stored-procedure-level imports and call it:

>>> import snowflake.snowpark
>>> from resources.test_sp_dir.test_sp_file import mod5
>>> from snowflake.snowpark.functions import sproc
>>>
>>> session.add_packages('snowflake-snowpark-python')
>>> @sproc(imports=[("tests/resources/test_sp_dir/test_sp_file.py", "resources.test_sp_dir.test_sp_file")])
... def mod5_and_plus1_sp(session_: snowflake.snowpark.Session, x: int) -> int:
...     return mod5(session_, x) + 1
>>> mod5_and_plus1_sp(2)
3
Copy
Example 6

Create a stored procedure with stored-procedure-level packages and call it:

>>> import snowflake.snowpark
>>> from snowflake.snowpark.functions import sproc
>>> import numpy as np
>>> import math
>>>
>>> @sproc(packages=["snowflake-snowpark-python", "numpy"])
... def sin_sp(_: snowflake.snowpark.Session, x: float) -> float:
...     return np.sin(x)
>>> sin_sp(0.5 * math.pi)
1.0
Copy
Example 7

Creating a stored procedure from a local Python file:

>>> session.add_packages('snowflake-snowpark-python')
>>> # mod5() in that file has type hints
>>> mod5_sp = session.sproc.register_from_file(
...     file_path="tests/resources/test_sp_dir/test_sp_file.py",
...     func_name="mod5",
... )
>>> mod5_sp(2)
2
Copy
Example 8

Creating a stored procedure from a Python file on an internal stage:

>>> from snowflake.snowpark.types import IntegerType
>>>
>>> session.add_packages('snowflake-snowpark-python')
>>> _ = session.sql("create or replace temp stage mystage").collect()
>>> _ = session.file.put("tests/resources/test_sp_dir/test_sp_file.py", "@mystage", auto_compress=False)
>>> mod5_sp = session.sproc.register_from_file(
...     file_path="@mystage/test_sp_file.py",
...     func_name="mod5",
...     return_type=IntegerType(),
...     input_types=[IntegerType()],
... )
>>> mod5_sp(2)
2
Copy

Methods

describe(sproc_obj)

Returns a DataFrame that describes the properties of a stored procedure.

register(func[, return_type, input_types, ...])

Registers a Python function as a Snowflake Python stored procedure and returns the stored procedure.

register_from_file(file_path, func_name[, ...])

Registers a Python function as a Snowflake Python stored procedure from a Python or zip file, and returns the stored procedure.