Snowflake ML Jobs

Use Snowflake ML Jobs to run machine learning (ML) workflows inside Snowflake ML container runtimes. You can submit jobs from any development environment; you don’t need to run the code in a Snowflake worksheet or notebook. Use jobs to leverage Snowflake’s infrastructure to run resource-intensive tasks within your development workflow. For information about setting up Snowflake ML locally, see Using Snowflake ML Locally.

Important

Snowflake ML Jobs are available as public preview in the Snowpark ML Python package (snowflake-ml-python) version 1.8.2 and later.

Snowflake ML Jobs enable you to do the following:

  • Run ML workloads on Snowflake Compute Pools, including GPU and high-memory CPU instances.

  • Use your preferred development environment such as VS Code or Jupyter notebooks.

  • Install and use custom Python packages within your runtime environment.

  • Use Snowflake’s distributed APIs to optimize data loading, training, and hyperparameter tuning.

  • Integrate with orchestration tools, such as Apache Airflow.

  • Monitor and manage jobs through Snowflake’s APIs.

You can use these capabilities to do the following:

  • Execute resource-intensive training on large datasets requiring GPU acceleration or significant compute resources.

  • Productionize ML workflows by moving ML code from development to production with programmatic execution through pipelines.

  • Retain your existing development environment while leveraging Snowflake’s compute resources.

  • Lift and shift OSS ML workflows with minimal code changes.

  • Work directly with large Snowflake datasets to reduce data movement and avoid expensive data transfers.

Prerequisites

Important

Snowflake ML Jobs currently only support Python 3.10 clients. Please contact your Snowflake account team if you need support for other Python versions.

  1. Install the Snowflake ML Python package in your Python 3.10 environment.

    pip install "snowflake-ml-python>=1.8.2"
    
  2. The default compute pool uses the CPU_X64_S instance family, with a minimum of 1 node and a maximum of 25 nodes. You can use the following SQL command to create a custom compute pool:

    CREATE COMPUTE POOL IF NOT EXISTS MY_COMPUTE_POOL
      MIN_NODES = <MIN_NODES>
      MAX_NODES = <MAX_NODES>
      INSTANCE_FAMILY = <INSTANCE_FAMILY>;
    
  3. Snowflake ML Jobs require a Snowpark Session. Use the following code to create it:

    from snowflake.snowpark import Session
    from snowflake.ml.jobs import list_jobs
    
    ls = list_jobs() # This will fail! You must create a session first.
    
    # Requires valid ~/.snowflake/config.toml file
    session = Session.builder.getOrCreate()
    
    ls = list_jobs(session=session)
    ls = list_jobs() # Infers created session from context
    

    For information about creating a session, see Creating a Session.
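
    If you don’t have a ~/.snowflake/config.toml file, you can instead create the session by passing connection parameters explicitly. The following is a minimal sketch; the placeholder values are illustrative and must be replaced with your own connection details:

    from snowflake.snowpark import Session

    # Placeholder connection parameters; replace with your own values
    connection_parameters = {
      "account": "<account_identifier>",
      "user": "<username>",
      "password": "<password>",
      "role": "<role_name>",
      "warehouse": "<warehouse_name>",
      "database": "<database_name>",
      "schema": "<schema_name>",
    }

    session = Session.builder.configs(connection_parameters).create()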

Run a Snowflake ML job

You can run a Snowflake ML Job in one of the following ways:

  • Using a remote decorator within your code.

  • Submitting entire files or directories using the Python API.

Run a Python function as a Snowflake ML Job

Use Function Dispatch to run individual Python functions remotely on Snowflake’s compute resources with the @remote decorator.

Using the @remote decorator, you can do the following:

  • Serialize the function and its dependencies.

  • Upload it to a specified Snowflake stage.

  • Execute it within the Container Runtime for ML.

The following example Python code uses the @remote decorator to submit a Snowflake ML Job. Note that a Snowpark Session is required; see Prerequisites.

from snowflake.ml.jobs import remote

@remote("MY_COMPUTE_POOL", stage_name="payload_stage", session=session)
def train_model(data_table: str):
  # Provide your ML code here, including imports and function calls
  ...

job = train_model("my_training_data")

Invoking a @remote decorated function returns a Snowflake MLJob object that can be used to manage and monitor the job execution. For more information, see Snowflake ML Job management.
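
For example, you can check the status and logs of the returned job object. The following minimal sketch uses only the attributes described in Snowflake ML Job management; the output values are illustrative:

# Check on the job returned by train_model() above
print(job.status)      # e.g. PENDING or RUNNING
print(job.get_logs())  # execution logs collected so far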

Run a Python file as a Snowflake ML Job

Run Python files or project directories on Snowflake compute resources. This is useful when:

  • You have complex ML projects with multiple modules and dependencies.

  • You want to maintain separation between local development and production code.

  • You need to run scripts that use command-line arguments.

  • You’re working with existing ML projects that weren’t specifically designed for execution on Snowflake compute.

The Snowflake ML Job API offers two main methods:

  • submit_file(): For running single Python files

  • submit_directory(): For running entire project directories with multiple files and resources

Both methods support:

  • Command-line argument passing

  • Environment variable configuration

  • Custom dependency specification

  • Project asset management through Snowflake stages

File Dispatch is particularly useful for productionizing existing ML workflows and maintaining clear separation between development and execution environments.

The following Python code submits a file as a Snowflake ML Job:

from snowflake.ml.jobs import submit_file

# Run a single file
job1 = submit_file(
  "train.py",
  "MY_COMPUTE_POOL",
  stage_name="payload_stage",
  args=["--data-table", "my_training_data"],
  session=session,
)
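
The train.py entrypoint above receives --data-table as an ordinary command-line argument. The following minimal sketch shows how such a script might parse it; the training logic is a placeholder:

# train.py
import argparse

def main():
  parser = argparse.ArgumentParser()
  parser.add_argument("--data-table", required=True)
  args = parser.parse_args()

  # Placeholder: load data from args.data_table and train your model here
  print(f"Training on table: {args.data_table}")

if __name__ == "__main__":
  main()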

The following Python code submits a directory as a Snowflake ML Job:

from snowflake.ml.jobs import submit_directory

# Run from a directory
job2 = submit_directory(
  "./ml_project/",
  "MY_COMPUTE_POOL",
  entrypoint="train.py",
  stage_name="payload_stage",
  session=session,
)

Submitting a file or directory returns a Snowflake MLJob object that can be used to manage and monitor the job execution. For more information, see Snowflake ML Job management.

Snowflake ML Job management

When you submit a Snowflake ML Job, the API creates an MLJob object. You can use it to do the following:

  • Track job progress through status updates

  • Debug issues using detailed execution logs

You can use the get_job() API to retrieve an MLJob object by its ID. The following Python code shows how to retrieve an MLJob object:

from snowflake.ml.jobs import MLJob, get_job, list_jobs

# List all jobs
jobs = list_jobs()

# Retrieve an existing job based on ID
job = get_job("<job_id>")  # job is an MLJob instance

# Retrieve status and logs for the retrieved job
print(job.status)  # PENDING, RUNNING, FAILED, DONE
print(job.get_logs())
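
If you need to block until a job reaches a terminal state, you can poll its status. The following is a minimal sketch, assuming the status property is refreshed each time it’s read; the polling interval is arbitrary:

import time

from snowflake.ml.jobs import get_job

job = get_job("<job_id>")

# Poll until the job reaches a terminal state (FAILED or DONE)
while job.status in ("PENDING", "RUNNING"):
  time.sleep(10)  # arbitrary polling interval

print(job.status)
print(job.get_logs())  # inspect logs after completion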

Managing dependencies

The Snowflake ML Job API runs payloads inside the Container Runtime for ML environment. The environment has the most commonly used Python packages for machine learning and data science. Most use cases should work “out of the box” without additional configuration. If you need custom dependencies, you can use pip_requirements to install them.

To install custom dependencies, you must enable external network access using an External Access Integration. You can use the following example SQL command to provide access:

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION PYPI_EAI
  ALLOWED_NETWORK_RULES = (snowflake.external_access.pypi_rule)
  ENABLED = true;

For more information about external access integrations, see Creating and using an external access integration.

After you’ve provided external network access, you can use the pip_requirements and external_access_integrations parameters to configure custom dependencies. Use them for packages that aren’t available in the container runtime environment or when you need specific versions of packages.

The following Python code shows how to specify custom dependencies to the remote decorator:

@remote(
  "MY_COMPUTE_POOL",
  stage_name="payload_stage",
  pip_requirements=["custom-package"],
  external_access_integrations=["PYPI_EAI"],
  session=session,
)
def my_function():
  # Your code here
  ...

The following Python code shows how to specify custom dependencies for the submit_file() method:

from snowflake.ml.jobs import submit_file

# You can include a version specifier to pin package version(s)
job = submit_file(
  "/path/to/repo/my_script.py",
  compute_pool,
  stage_name="payload_stage",
  pip_requirements=["custom-package==1.0.*"],
  external_access_integrations=["pypi_eai"],
  session=session,
)

Private package feeds

Snowflake ML Jobs also support loading packages from private feeds such as JFrog Artifactory and Sonatype Nexus Repository. These feeds are commonly used to distribute internal and proprietary packages, maintain control over dependency versions, and meet security and compliance requirements.

To install packages from a private feed, you must do the following:

  1. Create a Network Rule to allow access to the private feed’s URL.

    1. For sources that use basic authentication, you only need to create a network rule.

      CREATE OR REPLACE NETWORK RULE private_feed_nr
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('<your-repo>.jfrog.io');
      
    2. To configure access to a source that uses private connectivity (such as Private Link), follow the steps in Network egress using private connectivity.

  2. Create an External Access Integration using the network rule. Grant permission to use the EAI to the role that will be submitting jobs.

    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION private_feed_eai
    ALLOWED_NETWORK_RULES = (PRIVATE_FEED_NR)
    ENABLED = true;
    
    GRANT USAGE ON INTEGRATION private_feed_eai TO ROLE <role_name>;
    
  3. Specify the private feed URL, External Access Integration, and package(s) when submitting the job.

    # Option 1: Specify private feed URL in pip_requirements
    job = submit_file(
      "/path/to/script.py",
      compute_pool="MY_COMPUTE_POOL",
      stage_name="payload_stage",
      pip_requirements=[
        "--index-url=https://your.private.feed.url",
        "internal-package==1.2.3"
      ],
      external_access_integrations=["PRIVATE_FEED_EAI"]
    )
    
    # Option 2: Specify private feed URL by environment variable
    job = submit_directory(
      "/path/to/code/",
      compute_pool="MY_COMPUTE_POOL",
      entrypoint="script.py",
      stage_name="payload_stage",
      pip_requirements=["internal-package==1.2.3"],
      external_access_integrations=["PRIVATE_FEED_EAI"],
      env_vars={'PIP_INDEX_URL': 'https://your.private.feed.url'},
    )
    

If your private feed URL contains sensitive information such as authentication tokens, manage the URL by creating a Snowflake Secret. Use the CREATE SECRET command to create the secret, and mount it into the job at submission time with the spec_overrides argument.

# Create secret for private feed URL with embedded auth token
feed_url = "<your-repo>.jfrog.io/artifactory/api/pypi/test-pypi/simple"
auth_user = "<auth_user>"
auth_token = "<auth_token>"
session.sql(f"""
CREATE SECRET IF NOT EXISTS PRIVATE_FEED_URL_SECRET
 TYPE = GENERIC_STRING
 SECRET_STRING = 'https://{auth_user}:{auth_token}@{feed_url}'
""").collect()

# Prepare service spec override for mounting secret into job execution
spec_overrides = {
  "spec": {
    "containers": [
      {
        "name": "main",  # Primary container name is always "main"
        "secrets": [
          {
            "snowflakeSecret": "PRIVATE_FEED_URL_SECRET",
            "envVarName": "PIP_INDEX_URL",
            "secretKeyRef": "secret_string"
          },
        ],
      }
    ]
  }
}

# Load private feed URL from secret (e.g. if URL includes auth token)
job = submit_file(
  "/path/to/script.py",
  compute_pool="MY_COMPUTE_POOL",
  stage_name="payload_stage",
  pip_requirements=[
    "internal-package==1.2.3"
  ],
  external_access_integrations=["PRIVATE_FEED_EAI"],
  spec_overrides=spec_overrides,
)

For more information about container secrets, see the containers.secrets field.

Cost considerations

Snowflake ML Jobs run on Snowpark Container Services and are billed based on usage. For information about compute costs, see Snowpark Container Services costs.

Job payloads are uploaded to the stage specified with the stage_name argument. To avoid additional charges, clean them up when the jobs have finished. For more information about costs associated with stage storage, see Understanding storage cost and Exploring storage cost.
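
For example, you can inspect and remove uploaded payloads using stage commands. The following minimal sketch assumes a stage named payload_stage, matching the stage_name used in the earlier examples; adjust the stage name and path to your setup:

# Inspect the payloads uploaded for previous jobs
session.sql("LIST @payload_stage").show()

# Remove them once the jobs have finished to avoid ongoing storage costs
session.sql("REMOVE @payload_stage").collect()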