snowflake.ml.model.TransformersPipeline¶
- class snowflake.ml.model.TransformersPipeline(task: str, model: str, *, revision: Optional[str] = None, token_or_secret: Optional[str] = None, trust_remote_code: Optional[bool] = None, model_kwargs: Optional[dict[str, Any]] = None, compute_pool_for_log: Optional[str] = 'SYSTEM_COMPUTE_POOL_CPU', allow_patterns: Optional[Union[list[str], str]] = None, ignore_patterns: Optional[Union[list[str], str]] = None, **kwargs: Any)¶
Bases: object
Utility factory method that builds a wrapper over a transformers [Pipeline]. At deployment time, this wrapper creates the real pipeline object and loads the tokenizer and model.
For the pipeline documentation, please refer to: https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.pipeline
- Parameters:
task – The task the pipeline will be used for. Must be explicitly provided. For available tasks, refer to the Transformers documentation.
model – The model the pipeline will use to make predictions. This can currently only be a model identifier. Must be explicitly provided.
revision – The specific model version to use when passing a task name or a string model identifier. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git. Defaults to None.
token_or_secret – The token to use as HTTP bearer authorization for remote files. The value can be a token or a secret. If a secret is provided, it must be a fully qualified secret name. Defaults to None.
trust_remote_code – Whether or not to allow for custom code defined on the Hub in their own modeling, configuration, tokenization or even pipeline files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub. Defaults to None.
model_kwargs – Additional dictionary of keyword arguments passed along to the model’s from_pretrained(…) call. Defaults to None.
compute_pool_for_log – The compute pool to use for logging the model. Defaults to 'SYSTEM_COMPUTE_POOL_CPU'. If a string is provided, it is used as the compute pool name; this override allows logging the model when no system compute pool is available. If None is passed, the huggingface_hub package must be installed, and the model artifacts will be downloaded from the HuggingFace repository.
allow_patterns – If provided, only files matching at least one pattern are downloaded.
ignore_patterns – If provided, files matching any of the patterns are not downloaded.
kwargs – Additional keyword arguments passed along to the specific pipeline init (see the documentation for the corresponding pipeline class for possible values).
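The allow_patterns / ignore_patterns parameters accept huggingface_hub-style glob patterns. As a rough illustration of that filtering logic (a minimal sketch using Python's fnmatch; the helper name and exact semantics are illustrative, not the library's implementation):

```python
from fnmatch import fnmatch
from typing import Optional, Union

def select_files(
    files: list[str],
    allow_patterns: Optional[Union[list[str], str]] = None,
    ignore_patterns: Optional[Union[list[str], str]] = None,
) -> list[str]:
    """Sketch: keep files matching at least one allow pattern (if any are
    given) and matching none of the ignore patterns."""
    if isinstance(allow_patterns, str):
        allow_patterns = [allow_patterns]
    if isinstance(ignore_patterns, str):
        ignore_patterns = [ignore_patterns]
    kept = []
    for f in files:
        # Skip files not covered by any allow pattern.
        if allow_patterns is not None and not any(fnmatch(f, p) for p in allow_patterns):
            continue
        # Skip files matching any ignore pattern.
        if ignore_patterns is not None and any(fnmatch(f, p) for p in ignore_patterns):
            continue
        kept.append(f)
    return kept

files = ["model.safetensors", "pytorch_model.bin", "config.json", "README.md"]
print(select_files(files, allow_patterns=["*.safetensors", "*.json"]))
# ['model.safetensors', 'config.json']
```

For example, passing allow_patterns="*.safetensors" would download only safetensors weights and skip duplicate .bin checkpoints.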
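A usage sketch of the constructor above, assuming a Snowflake environment with snowflake-ml-python installed (the task and model identifier are example values, not a definitive recipe):

```python
from snowflake.ml.model import TransformersPipeline

# Build the wrapper; the real pipeline (tokenizer + model) is only
# created and loaded at deployment time.
pipe = TransformersPipeline(
    task="text-classification",  # required; see the Transformers docs for available tasks
    model="distilbert-base-uncased-finetuned-sst-2-english",  # example model identifier
    revision="main",             # any git identifier: branch, tag, or commit id
    trust_remote_code=False,     # set True only for repositories you trust
)
```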