You are viewing documentation about an older version (1.2.0).

snowflake.snowpark.udtf.UDTFRegistration.register_from_file

UDTFRegistration.register_from_file(file_path: str, handler_name: str, output_schema: StructType | Iterable[str], input_types: List[DataType] | None = None, name: str | Iterable[str] | None = None, is_permanent: bool = False, stage_location: str | None = None, imports: List[str | Tuple[str, str]] | None = None, packages: List[str | module] | None = None, replace: bool = False, if_not_exists: bool = False, parallel: int = 4, strict: bool = False, secure: bool = False, *, statement_params: Dict[str, str] | None = None) → UserDefinedTableFunction

Registers a Python class as a Snowflake Python UDTF from a Python or zip file, and returns the UDTF. Apart from file_path and handler_name, the input arguments of this method are the same as those of register(). See examples in UDTFRegistration.
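As a rough illustration of the description above, the following sketch writes a small handler class to a local Python file and wraps the register_from_file call in a function. The class name WordSplitter, the file name word_splitter.py, the column names, and the helper functions are all hypothetical and chosen only for this example. The Snowpark import sits inside the registration function so that the handler can be exercised locally without a Snowflake connection.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical handler source: the class and its output columns are illustrative.
HANDLER_SOURCE = '''
class WordSplitter:
    """UDTF handler that emits one row per whitespace-separated word."""

    def process(self, text: str):
        for position, word in enumerate(text.split()):
            yield (position, word)
'''


def write_handler_file(directory: str) -> Path:
    """Write the handler source to <directory>/word_splitter.py."""
    path = Path(directory) / "word_splitter.py"
    path.write_text(HANDLER_SOURCE)
    return path


def load_handler(path: Path):
    """Import the handler class locally so it can be tried without Snowflake."""
    spec = importlib.util.spec_from_file_location("word_splitter", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.WordSplitter


def register_word_splitter(session, path: Path):
    """Register the handler as a temporary UDTF; needs a live Snowpark session."""
    from snowflake.snowpark.types import (
        IntegerType,
        StringType,
        StructField,
        StructType,
    )

    return session.udtf.register_from_file(
        file_path=str(path),
        handler_name="WordSplitter",
        output_schema=StructType(
            [
                StructField("position", IntegerType()),
                StructField("word", StringType()),
            ]
        ),
        input_types=[StringType()],
    )


# Local dry run of the handler logic (no Snowflake connection involved).
handler_path = write_handler_file(tempfile.mkdtemp())
rows = list(load_handler(handler_path)().process("to be or"))
print(rows)
```

With an open session, calling register_word_splitter(session, handler_path) would return the UserDefinedTableFunction object described above.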

Parameters:
  • file_path – The path of a local file or a remote file in the stage. For details, see the path argument of session.add_import(). Unlike that argument, the file here can only be a Python file or a compressed file (e.g., a .zip file) containing Python modules.

  • handler_name – The Python class name in the file that the UDTF will use as the handler.

  • output_schema – A list of column names, or a StructType instance that represents the table function’s columns.

  • input_types – A list of DataType representing the input data types of the UDTF. Optional if type hints are provided.

  • name – A string or list of strings that specify the name or fully-qualified object identifier (database name, schema name, and function name) for the UDTF in Snowflake, which allows you to call this UDTF in a SQL command or via call_udtf(). If it is not provided, a name will be automatically generated for the UDTF. A name must be specified when is_permanent is True.

  • is_permanent – Whether to create a permanent UDTF. The default is False. If it is True, a valid stage_location must be provided.

  • stage_location – The stage location where the Python file for the UDTF and its dependencies should be uploaded. The stage location must be specified when is_permanent is True, and it will be ignored when is_permanent is False. It can be any stage other than temporary stages and external stages.

  • imports – A list of imports that only apply to this UDTF. You can use a string to represent a file path (similar to the path argument in add_import()) in this list, or a tuple of two strings to represent a file path and an import path (similar to the import_path argument in add_import()). These UDTF-level imports will override the session-level imports added by add_import().

  • packages – A list of packages that only apply to this UDTF. These UDTF-level packages will override the session-level packages added by add_packages() and add_requirements().

  • replace – Whether to replace a UDTF that was already registered. The default is False. If it is False, attempting to register a UDTF with a name that already exists raises a SnowparkSQLException. If it is True, an existing UDTF with the same name is overwritten.

  • if_not_exists – Whether to skip creation of a UDTF when one with the same signature already exists. The default is False. if_not_exists and replace are mutually exclusive; a ValueError is raised when both are set.

  • session – Use this session to register the UDTF. If it’s not specified, the session that you created before calling this function will be used. You need to specify this parameter if you have created multiple sessions before calling this method.

  • parallel – The number of threads to use for uploading UDTF files with the PUT command. The default value is 4 and supported values are from 1 to 99. Increasing the number of threads can improve performance when uploading large UDTF files.

  • strict – Whether the created UDTF is strict. A strict UDTF is not invoked if any input is null; instead, a null value is always returned for that row. Note that the UDTF might still return null for non-null inputs.

  • secure – Whether the created UDTF is secure. For more information about secure functions, see Secure UDFs.

  • statement_params – Dictionary of statement level parameters to be set while executing this action.
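Several of the parameters above constrain each other: a permanent UDTF needs both name and stage_location, replace and if_not_exists cannot both be set, and parallel must lie between 1 and 99. The sketch below collects those rules in a hypothetical pre-flight helper, build_register_kwargs, which is not part of Snowpark. It raises ValueError locally for every violation. Only the replace/if_not_exists case is documented above as a ValueError, so the exception Snowpark itself raises in the other cases may differ. The stage name @my_stage and the UDTF name are placeholders.

```python
from typing import Any, Dict, Iterable, Optional


def build_register_kwargs(
    file_path: str,
    handler_name: str,
    output_schema: Iterable[str],
    *,
    name: Optional[str] = None,
    is_permanent: bool = False,
    stage_location: Optional[str] = None,
    replace: bool = False,
    if_not_exists: bool = False,
    parallel: int = 4,
) -> Dict[str, Any]:
    """Check the documented parameter constraints and return call kwargs.

    Hypothetical helper for illustration; it only mirrors the rules stated
    in the parameter list and does not talk to Snowflake.
    """
    if is_permanent and not name:
        raise ValueError("name must be specified when is_permanent is True")
    if is_permanent and not stage_location:
        raise ValueError("stage_location must be specified when is_permanent is True")
    if replace and if_not_exists:
        raise ValueError("replace and if_not_exists are mutually exclusive")
    if not 1 <= parallel <= 99:
        raise ValueError("parallel must be between 1 and 99")

    return {
        "file_path": file_path,
        "handler_name": handler_name,
        "output_schema": output_schema,
        "name": name,
        "is_permanent": is_permanent,
        "stage_location": stage_location,
        "replace": replace,
        "if_not_exists": if_not_exists,
        "parallel": parallel,
    }


# Placeholder values: the file, stage, and UDTF names are assumptions.
permanent_kwargs = build_register_kwargs(
    "word_splitter.py",
    "WordSplitter",
    ["position", "word"],
    name="word_splitter_udtf",
    is_permanent=True,
    stage_location="@my_stage",
    replace=True,
)
```

With a live session, the result could then be passed along as session.udtf.register_from_file(**permanent_kwargs).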

Note:

Type hints can still be extracted from the source Python file if they are provided, but this does not currently work for a zip file. Therefore, you have to provide output_schema and input_types when file_path points to a zip file.
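For the zip case described in the note, a minimal sketch might package the handler module with the standard zipfile module and pass output_schema and input_types explicitly. The archive name word_splitter.zip, the module name word_splitter.py, and the WordSplitter class are hypothetical. Giving the archive and the module the same base name is an assumption made here, since how the handler is resolved inside an archive is not covered on this page.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical handler module; no type hints, since they are not read from a zip.
HANDLER_SOURCE = '''
class WordSplitter:
    def process(self, text):
        for position, word in enumerate(text.split()):
            yield (position, word)
'''


def build_handler_zip(directory: str) -> Path:
    """Create <directory>/word_splitter.zip containing word_splitter.py."""
    archive_path = Path(directory) / "word_splitter.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("word_splitter.py", HANDLER_SOURCE)
    return archive_path


def register_from_zip(session, archive_path: Path):
    """Register the zipped handler; needs a live Snowpark session."""
    from snowflake.snowpark.types import (
        IntegerType,
        StringType,
        StructField,
        StructType,
    )

    return session.udtf.register_from_file(
        file_path=str(archive_path),
        handler_name="WordSplitter",
        # Both schemas are given explicitly because file_path is a zip file.
        output_schema=StructType(
            [
                StructField("position", IntegerType()),
                StructField("word", StringType()),
            ]
        ),
        input_types=[StringType()],
    )


# Build the archive locally and list its contents as a sanity check.
archive_path = build_handler_zip(tempfile.mkdtemp())
with zipfile.ZipFile(archive_path) as archive:
    archive_members = archive.namelist()
print(archive_members)
```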

See also