Calling functions and stored procedures in Snowpark Python

To process data in a DataFrame, you can call system-defined SQL functions, user-defined functions, and stored procedures. This topic explains how to call them in Snowpark.

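Calling system-defined functions

To call a system-defined SQL function, use the equivalent function in the snowflake.snowpark.functions module.

The following example calls the upper function in the functions module (the equivalent of the system-defined UPPER function) to return the values in the name column of the sample_product_data table in uppercase.

# Import the upper and col functions from the functions module.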
from snowflake.snowpark.functions import upper, col
session.table("sample_product_data").select(upper(col("name")).alias("upper_name")).collect()

[Row(UPPER_NAME='PRODUCT 1'), Row(UPPER_NAME='PRODUCT 1A'), Row(UPPER_NAME='PRODUCT 1B'), Row(UPPER_NAME='PRODUCT 2'),
Row(UPPER_NAME='PRODUCT 2A'), Row(UPPER_NAME='PRODUCT 2B'), Row(UPPER_NAME='PRODUCT 3'), Row(UPPER_NAME='PRODUCT 3A'),
Row(UPPER_NAME='PRODUCT 3B'), Row(UPPER_NAME='PRODUCT 4'), Row(UPPER_NAME='PRODUCT 4A'), Row(UPPER_NAME='PRODUCT 4B')]

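If a system-defined SQL function is not available in the functions module, you can use one of the following approaches:

Use the call_function function to call the system-defined function.
Use the function function to create a function object that you can use to call the system-defined function.

call_function and function are defined in the snowflake.snowpark.functions module.

For call_function, pass the name of the system-defined function as the first argument. If you need to pass column values to the system-defined function, pass Column objects as additional arguments to call_function.

The following example calls the system-defined function RADIANS, passing in the values from the column col1.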

# Import the call_function and col functions from the functions module.
from snowflake.snowpark.functions import call_function, col
df = session.create_dataframe([[1, 2], [3, 4]], schema=["col1", "col2"])
# Call the system-defined function RADIANS() on col1.
df.select(call_function("radians", col("col1"))).collect()

[Row(RADIANS("COL1")=0.017453292519943295), Row(RADIANS("COL1")=0.05235987755982988)]
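
As a quick sanity check on the output above, the same conversion can be reproduced locally with Python's standard math module. This is only a sketch for verifying the arithmetic (degrees multiplied by π/180); it does not call Snowflake, and the name local_radians is an illustrative assumption.

```python
import math

# Values from col1 in the example DataFrame.
col1_values = [1, 3]

# math.radians converts degrees to radians (degrees * pi / 180),
# the same conversion that the RADIANS function performs.
local_radians = [math.radians(v) for v in col1_values]
print(local_radians)
```

The printed values should agree with the RADIANS("COL1") values in the rows returned by Snowflake, up to floating-point rounding.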

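The call_function function returns a Column, which you can pass to DataFrame transformation methods (for example, filter and select).

For function, pass the name of the system-defined function, then use the returned function object to call the system-defined function. For example: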

# Import the function and col functions from the functions module.
from snowflake.snowpark.functions import function, col

# Create a function object for the system-defined function RADIANS().
radians = function("radians")
df = session.create_dataframe([[1, 2], [3, 4]], schema=["col1", "col2"])
# Call the system-defined function RADIANS() on col1.
df.select(radians(col("col1"))).collect()

[Row(RADIANS("COL1")=0.017453292519943295), Row(RADIANS("COL1")=0.05235987755982988)]

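Calling user-defined functions (UDFs)

To call UDFs that you registered by name and UDFs that you created by executing CREATE FUNCTION, use the call_udf function in the snowflake.snowpark.functions module. Pass the name of the UDF as the first argument and any UDF parameters as additional arguments.

The following example calls the UDF minus_one, passing in the values from the column col1. The example passes the Column returned by call_udf to the select method of the DataFrame.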

# Import the call_udf and col functions from the functions module.
from snowflake.snowpark.functions import call_udf, col

# Runs the scalar function 'minus_one' on col1 of df.
df = session.create_dataframe([[1, 2], [3, 4]], schema=["col1", "col2"])
df.select(call_udf("minus_one", col("col1"))).collect()

[Row(MINUS_ONE("COL1")=0), Row(MINUS_ONE("COL1")=2)]
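
This topic does not show how minus_one is defined. Judging from the output above, it presumably subtracts one from its argument; under that assumption, the handler logic can be sketched and checked as a plain Python function before it is registered as a UDF. The body below is a guess for illustration, not the actual definition.

```python
# Hypothetical handler for the minus_one UDF (an assumption; the
# actual definition is not shown in this topic).
def minus_one(x: int) -> int:
    return x - 1

# Apply the handler locally to the col1 values from the example.
results = [minus_one(v) for v in [1, 3]]
print(results)
```

If the real UDF behaves this way, the local results line up with the MINUS_ONE("COL1") values in the returned rows.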

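Calling user-defined table functions (UDTFs)

To call UDTFs that you registered by name and UDTFs that you created by executing CREATE FUNCTION, use one of the functions listed below. Both return a DataFrame that represents a lazily evaluated relational dataset.

Note that you can also use these to call other table functions, including system-defined table functions.

For more information about registering a UDTF, see Registering a UDTF.

To call a UDTF without specifying a lateral join, call the table_function method of the snowflake.snowpark.Session class.

For the reference and examples, see Session.table_function.

The code in the following example uses table_function to call the generator_udtf function, which was registered with the udtf function.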

from snowflake.snowpark.types import IntegerType, StructField, StructType
from snowflake.snowpark.functions import udtf, lit
class GeneratorUDTF:
    def process(self, n):
        for i in range(n):
            yield (i, )
generator_udtf = udtf(GeneratorUDTF, output_schema=StructType([StructField("number", IntegerType())]), input_types=[IntegerType()])
session.table_function(generator_udtf(lit(3))).collect()

[Row(NUMBER=0), Row(NUMBER=1), Row(NUMBER=2)]
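
Because a UDTF handler is an ordinary Python class, its process method can be exercised locally before registration. The following sketch reuses the GeneratorUDTF class from the example and runs it without a Snowflake session; each yielded tuple corresponds to one output row. The variable name rows is illustrative.

```python
class GeneratorUDTF:
    def process(self, n):
        # Yield one single-column row for each integer in [0, n).
        for i in range(n):
            yield (i, )

# Call the handler directly, as a plain Python generator.
rows = list(GeneratorUDTF().process(3))
print(rows)
```

Each tuple has one element because the output schema in the example declares a single column, number.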

To call a UDTF with a lateral join, use the join_table_function method of the snowflake.snowpark.DataFrame class.

When you laterally join a UDTF, you can specify the PARTITION BY and ORDER BY clauses.

For the reference and examples, see DataFrame.join_table_function.

The code in the following example performs a lateral join, specifying the partition_by and order_by parameters. The code first calls the snowflake.snowpark.functions.table_function function to create a function object that represents the system-defined SPLIT_TO_TABLE function. This function object is what join_table_function calls.

For the reference for the snowflake.snowpark.functions.table_function function, see table_function. For the reference for the SPLIT_TO_TABLE function, see SPLIT_TO_TABLE.

from snowflake.snowpark.functions import table_function, lit
split_to_table = table_function("split_to_table")
df = session.create_dataframe([
  ["John", "James", "address1 address2 address3"],
  ["Mike", "James", "address4 address5 address6"],
  ["Cathy", "Stone", "address4 address5 address6"],
],
schema=["first_name", "last_name", "addresses"])
df.join_table_function(split_to_table(df["addresses"], lit(" ")).over(partition_by="last_name", order_by="first_name")).show()

----------------------------------------------------------------------------------------
|"FIRST_NAME"  |"LAST_NAME"  |"ADDRESSES"                 |"SEQ"  |"INDEX"  |"VALUE"   |
----------------------------------------------------------------------------------------
|John          |James        |address1 address2 address3  |1      |1        |address1  |
|John          |James        |address1 address2 address3  |1      |2        |address2  |
|John          |James        |address1 address2 address3  |1      |3        |address3  |
|Mike          |James        |address4 address5 address6  |2      |1        |address4  |
|Mike          |James        |address4 address5 address6  |2      |2        |address5  |
|Mike          |James        |address4 address5 address6  |2      |3        |address6  |
|Cathy         |Stone        |address4 address5 address6  |3      |1        |address4  |
|Cathy         |Stone        |address4 address5 address6  |3      |2        |address5  |
|Cathy         |Stone        |address4 address5 address6  |3      |3        |address6  |
----------------------------------------------------------------------------------------
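
To make the SEQ, INDEX, and VALUE columns easier to read, the following pure-Python sketch mimics how those columns are built for each input string: INDEX is the 1-based position of each piece and VALUE is the piece itself. It is an illustration only and does not call Snowflake. The helper name split_rows is hypothetical, and SEQ is simply numbered in input order here, whereas Snowflake's SEQ values should not be relied on to follow any particular order.

```python
def split_rows(strings, delimiter):
    """Mimic the (SEQ, INDEX, VALUE) rows shown in the output above."""
    out = []
    for seq, text in enumerate(strings, start=1):
        # INDEX is the 1-based position of each piece within the string.
        for index, value in enumerate(text.split(delimiter), start=1):
            out.append((seq, index, value))
    return out

result = split_rows(
    ["address1 address2 address3", "address4 address5 address6"], " "
)
for row in result:
    print(row)
```

In the real lateral join, these three columns are appended to the columns of the input row that produced them.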

Calling stored procedures

To call a stored procedure, use the call method of the Session class.

session.call("your_proc_name", 1)