Snowflake ML Jobs

Use Snowflake ML Jobs to run machine learning (ML) workflows inside Snowflake ML container runtimes. You can submit jobs from any development environment; you don’t need to run the code in a Snowflake worksheet or notebook. Use jobs to leverage Snowflake’s infrastructure to run resource-intensive tasks within your development workflow. For information about setting up Snowflake ML locally, see Using Snowflake ML Locally.

Important

Snowflake ML Jobs are available as public preview in the Snowpark ML Python package (snowflake-ml-python) version 1.8.2 and later.

Snowflake ML Jobs enable you to do the following:

  • Run ML workloads on Snowflake Compute Pools, including GPU and high-memory CPU instances.

  • Use your preferred development environment such as VS Code or Jupyter notebooks.

  • Install and use custom Python packages within your runtime environment.

  • Use Snowflake’s distributed APIs to optimize data loading, training, and hyperparameter tuning.

  • Integrate with orchestration tools, such as Apache Airflow.

  • Monitor and manage jobs through Snowflake’s APIs.

You can use these capabilities to do the following:

  • Execute resource-intensive training on large datasets requiring GPU acceleration or significant compute resources.

  • Productionize ML workflows by moving ML code from development to production with programmatic execution through pipelines.

  • Retain your existing development environment while leveraging Snowflake’s compute resources.

  • Lift and shift OSS ML workflows with minimal code changes.

  • Work directly with large Snowflake datasets to reduce data movement and avoid expensive data transfers.

Prerequisites

Important

Snowflake ML Jobs currently only support Python 3.10 clients. Please contact your Snowflake account team if you need support for other Python versions.

  1. Install the Snowflake ML Python package in your Python 3.10 environment.

    pip install "snowflake-ml-python>=1.8.2"
    
  2. The default compute pool uses the CPU_X64_S instance family, with a minimum of 1 node and a maximum of 25 nodes. You can use the following SQL command to create a custom compute pool:

    CREATE COMPUTE POOL IF NOT EXISTS MY_COMPUTE_POOL
      MIN_NODES = <MIN_NODES>
      MAX_NODES = <MAX_NODES>
      INSTANCE_FAMILY = <INSTANCE_FAMILY>;
    
  3. Snowflake ML Jobs require a Snowpark Session. Use the following code to create it:

    from snowflake.snowpark import Session
    from snowflake.ml.jobs import list_jobs
    
    ls = list_jobs() # This will fail! You must create a session first.
    
    # Requires valid ~/.snowflake/config.toml file
    session = Session.builder.getOrCreate()
    
    ls = list_jobs(session=session)
    ls = list_jobs() # Infers created session from context
    

    For information about creating a session, see Creating a Session.
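
    If you don’t have a ~/.snowflake/config.toml file, you can instead create the session by passing connection parameters explicitly. The following is a minimal sketch; the placeholder values are illustrative and must be replaced with your own connection details:

    from snowflake.snowpark import Session

    # Placeholder connection parameters; replace with your own values
    connection_parameters = {
      "account": "<account_identifier>",
      "user": "<username>",
      "password": "<password>",
      "role": "<role_name>",
      "warehouse": "<warehouse_name>",
      "database": "<database_name>",
      "schema": "<schema_name>",
    }

    session = Session.builder.configs(connection_parameters).create()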

Run a Snowflake ML job

You can run a Snowflake ML Job in one of the following ways:

  • Using a remote decorator within your code.

  • Submitting entire files or directories using the Python API.

Run a Python function as a Snowflake ML Job

Use Function Dispatch to run individual Python functions remotely on Snowflake’s compute resources with the @remote decorator.

Using the @remote decorator, you can do the following:

  • Serialize the function and its dependencies.

  • Upload it to a specified Snowflake stage.

  • Execute it within the Container Runtime for ML.

The following example Python code uses the @remote decorator to submit a Snowflake ML Job. Note that a Snowpark Session is required; see Prerequisites.

from snowflake.ml.jobs import remote

@remote("MY_COMPUTE_POOL", stage_name="payload_stage", session=session)
def train_model(data_table: str):
  # Provide your ML code here, including imports and function calls
  ...

job = train_model("my_training_data")

Invoking a @remote decorated function returns a Snowflake MLJob object that can be used to manage and monitor the job execution. For more information, see Snowflake ML Job management.
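
For example, you can check the status and logs of the returned job object. The following minimal sketch uses only the attributes described in Snowflake ML Job management; the output values are illustrative:

# Check on the job returned by train_model() above
print(job.status)      # e.g. PENDING or RUNNING
print(job.get_logs())  # execution logs collected so far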

Run a Python file as a Snowflake ML Job

Run Python files or project directories on Snowflake compute resources. This is useful when:

  • You have complex ML projects with multiple modules and dependencies.

  • You want to maintain separation between local development and production code.

  • You need to run scripts that use command-line arguments.

  • You’re working with existing ML projects that weren’t specifically designed for execution on Snowflake compute.

The Snowflake ML Job API offers two main methods:

  • submit_file(): For running single Python files

  • submit_directory(): For running entire project directories with multiple files and resources

Both methods support:

  • Command-line argument passing

  • Environment variable configuration

  • Custom dependency specification

  • Project asset management through Snowflake stages

File Dispatch is particularly useful for productionizing existing ML workflows and maintaining clear separation between development and execution environments.

The following Python code submits a file as a Snowflake ML Job:

from snowflake.ml.jobs import submit_file

# Run a single file
job1 = submit_file(
  "train.py",
  "MY_COMPUTE_POOL",
  stage_name="payload_stage",
  args=["--data-table", "my_training_data"],
  session=session,
)
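
The train.py entrypoint above receives --data-table as an ordinary command-line argument. The following minimal sketch shows how such a script might parse it; the training logic is a placeholder:

# train.py
import argparse

def main():
  parser = argparse.ArgumentParser()
  parser.add_argument("--data-table", required=True)
  args = parser.parse_args()

  # Placeholder: load data from args.data_table and train your model here
  print(f"Training on table: {args.data_table}")

if __name__ == "__main__":
  main()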

The following Python code submits a directory as a Snowflake ML Job:

from snowflake.ml.jobs import submit_directory

# Run from a directory
job2 = submit_directory(
  "./ml_project/",
  "MY_COMPUTE_POOL",
  entrypoint="train.py",
  stage_name="payload_stage",
  session=session,
)

Submitting a file or directory returns a Snowflake MLJob object that can be used to manage and monitor the job execution. For more information, see Snowflake ML Job management.

Snowflake ML Job management

When you submit a Snowflake ML Job, the API creates an MLJob object. You can use it to do the following:

  • Track job progress through status updates

  • Debug issues using detailed execution logs

You can use the get_job() API to retrieve an MLJob object by its ID. The following Python code shows how to retrieve an MLJob object:

from snowflake.ml.jobs import MLJob, get_job, list_jobs

# List all jobs
jobs = list_jobs()

# Retrieve an existing job based on ID
job = get_job("<job_id>")  # job is an MLJob instance

# Retrieve status and logs for the retrieved job
print(job.status)  # PENDING, RUNNING, FAILED, DONE
print(job.get_logs())
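
If you need to block until a job reaches a terminal state, you can poll its status. The following is a minimal sketch, assuming the status property is refreshed each time it’s read; the polling interval is arbitrary:

import time

from snowflake.ml.jobs import get_job

job = get_job("<job_id>")

# Poll until the job reaches a terminal state (FAILED or DONE)
while job.status in ("PENDING", "RUNNING"):
  time.sleep(10)  # arbitrary polling interval

print(job.status)
print(job.get_logs())  # inspect logs after completion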

Managing dependencies

The Snowflake ML Job API runs payloads inside the Container Runtime for ML environment. The environment has the most commonly used Python packages for machine learning and data science. Most use cases should work “out of the box” without additional configuration. If you need custom dependencies, you can use pip_requirements to install them.

To install custom dependencies, you must enable external network access using an External Access Integration. You can use the following example SQL command to provide access:

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION PYPI_EAI
  ALLOWED_NETWORK_RULES = (snowflake.external_access.pypi_rule)
  ENABLED = true;

For more information about external access integrations, see Creating and using an external access integration.

After you’ve provided external network access, you can use the pip_requirements and external_access_integrations parameters to configure custom dependencies. Use them for packages that aren’t available in the container runtime environment or when you need specific versions of packages.

The following Python code shows how to specify custom dependencies to the remote decorator:

@remote(
  "MY_COMPUTE_POOL",
  stage_name="payload_stage",
  pip_requirements=["custom-package"],
  external_access_integrations=["PYPI_EAI"],
  session=session,
)
def my_function():
  # Your code here
  ...

The following Python code shows how to specify custom dependencies for the submit_file() method:

from snowflake.ml.jobs import submit_file

# You can include a version specifier to pin package version(s)
job = submit_file(
  "/path/to/repo/my_script.py",
  compute_pool,
  stage_name="payload_stage",
  pip_requirements=["custom-package==1.0.*"],
  external_access_integrations=["pypi_eai"],
  session=session,
)

Private package feeds

Snowflake ML Jobs also support loading packages from private feeds such as JFrog Artifactory and Sonatype Nexus Repository. These feeds are commonly used to distribute internal and proprietary packages, maintain control over dependency versions, and meet security and compliance requirements.

To install packages from a private feed, you must do the following:

  1. Create a Network Rule to allow access to the private feed’s URL.

    1. For sources that use basic authentication, you only need to create a network rule.

      CREATE OR REPLACE NETWORK RULE private_feed_nr
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('<your-repo>.jfrog.io');
      
    2. To configure access to a source that uses private connectivity (such as Private Link), follow the steps in Network egress using private connectivity.

  2. Create an External Access Integration using the network rule. Grant permission to use the EAI to the role that will be submitting jobs.

    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION private_feed_eai
    ALLOWED_NETWORK_RULES = (PRIVATE_FEED_NR)
    ENABLED = true;
    
    GRANT USAGE ON INTEGRATION private_feed_eai TO ROLE <role_name>;
    
  3. Specify the private feed URL, External Access Integration, and package(s) when submitting the job.

    # Option 1: Specify private feed URL in pip_requirements
    job = submit_file(
      "/path/to/script.py",
      compute_pool="MY_COMPUTE_POOL",
      stage_name="payload_stage",
      pip_requirements=[
        "--index-url=https://your.private.feed.url",
        "internal-package==1.2.3"
      ],
      external_access_integrations=["PRIVATE_FEED_EAI"]
    )
    
    # Option 2: Specify private feed URL by environment variable
    job = submit_directory(
      "/path/to/code/",
      compute_pool="MY_COMPUTE_POOL",
      entrypoint="script.py",
      stage_name="payload_stage",
      pip_requirements=["internal-package==1.2.3"],
      external_access_integrations=["PRIVATE_FEED_EAI"],
      env_vars={'PIP_INDEX_URL': 'https://your.private.feed.url'},
    )
    

If your private feed URL contains sensitive information such as authentication tokens, manage the URL by creating a Snowflake Secret. Use the CREATE SECRET command to create the secret, and mount it into the job at submission time with the spec_overrides argument.

# Create secret for private feed URL with embedded auth token
feed_url = "<your-repo>.jfrog.io/artifactory/api/pypi/test-pypi/simple"
auth_user = "<auth_user>"
auth_token = "<auth_token>"
session.sql(f"""
CREATE SECRET IF NOT EXISTS PRIVATE_FEED_URL_SECRET
 TYPE = GENERIC_STRING
 SECRET_STRING = 'https://{auth_user}:{auth_token}@{feed_url}'
""").collect()

# Prepare service spec override for mounting secret into job execution
spec_overrides = {
  "spec": {
    "containers": [
      {
        "name": "main",  # Primary container name is always "main"
        "secrets": [
          {
            "snowflakeSecret": "PRIVATE_FEED_URL_SECRET",
            "envVarName": "PIP_INDEX_URL",
            "secretKeyRef": "secret_string"
          },
        ],
      }
    ]
  }
}

# Load private feed URL from secret (e.g. if URL includes auth token)
job = submit_file(
  "/path/to/script.py",
  compute_pool="MY_COMPUTE_POOL",
  stage_name="payload_stage",
  pip_requirements=[
    "internal-package==1.2.3"
  ],
  external_access_integrations=["PRIVATE_FEED_EAI"],
  spec_overrides=spec_overrides,
)

For more information about container secrets, see the containers.secrets field.

Cost considerations

Snowflake ML Jobs run on Snowpark Container Services and are billed based on usage. For information about compute costs, see Snowpark Container Services costs.

Job payloads are uploaded to the stage specified with the stage_name argument. To avoid additional charges, clean them up when the jobs have finished. For more information about costs associated with stage storage, see Understanding storage cost and Exploring storage cost.
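
For example, you can inspect and remove uploaded payloads using stage commands. The following minimal sketch assumes a stage named payload_stage, matching the stage_name used in the earlier examples; adjust the stage name and path to your setup:

# Inspect the payloads uploaded for previous jobs
session.sql("LIST @payload_stage").show()

# Remove them once the jobs have finished to avoid ongoing storage costs
session.sql("REMOVE @payload_stage").collect()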