Notebooks on Container Runtime for ML

Overview

You can run Snowflake Notebooks on Container Runtime for ML. Container Runtime for ML is powered by Snowpark Container Services, giving you a flexible container infrastructure that supports building and operationalizing a wide variety of workflows entirely within Snowflake. Container Runtime for ML provides software and hardware options to support advanced data science and machine learning workloads. Compared to virtual warehouses, Container Runtime for ML provides a more flexible compute environment where you can install packages from multiple sources and select compute resources, including GPU machine types, while still running SQL queries on warehouses for optimal performance.

This document describes some considerations for using notebooks on Container Runtime for ML. You can also try the Getting Started with Snowflake Notebook Container Runtime quickstart to learn more about using the Container Runtime for ML in your development.

Prerequisites

Before you start using Snowflake Notebooks on Container Runtime for ML, the ACCOUNTADMIN role must complete the notebook setup steps for creating the necessary resources and granting privileges to those resources. For detailed steps, see User setup instructions for Snowflake Notebooks.

Create a notebook on Container Runtime for ML

When you create a notebook on Container Runtime for ML, you choose a warehouse, runtime, and compute pool to provide the resources to run your notebook. The runtime you choose gives you access to different Python packages based on your use case. Different warehouse sizes or compute pools have different cost and performance implications. All of these settings can be changed later if needed.

Note

A user with the ACCOUNTADMIN, ORGADMIN, or SECURITYADMIN roles cannot directly create or own a notebook on Container Runtime for ML. Notebooks created or directly owned by these roles will fail to run. However, if a notebook is owned by a role that the ACCOUNTADMIN, ORGADMIN, or SECURITYADMIN roles inherit privileges from, such as the PUBLIC role, then you can use those roles to run that notebook.

To create a Snowflake Notebook to run on Container Runtime for ML, follow these steps:

  1. Sign in to Snowsight.

  2. Select Notebooks.

  3. Select + Notebook.

  4. Enter a name for your notebook.

  5. Select a database and schema in which to store your notebook. These cannot be changed after you create the notebook.

    Note

    The database and schema are only required for storing your notebooks. You can query any database and schema your role has access to from within your notebook.

  6. Select Run on container as your Python environment.

  7. Select the Runtime type: CPU or GPU.

  8. Select a Compute pool.

  9. Select or change the warehouse used to run SQL and Snowpark queries.

    For guidance on what size warehouse to use, see Warehouse recommendations for running Snowflake Notebooks.

  10. To create and open your notebook, select Create.

Runtime:

Container Runtime for ML provides two types of runtimes: CPU and GPU. Each runtime image contains a base set of Python packages and versions verified and integrated by Snowflake. All runtime images support data analysis, modeling, and training with Snowpark Python, Snowpark ML, and Streamlit.
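In a GPU runtime, you can confirm from a Python cell that the notebook sees the attached GPUs. The following is a minimal sketch that assumes PyTorch is available in the image; check the pre-installed package list for your runtime:

import torch

# Report whether CUDA devices are visible to the notebook kernel.
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())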

To install additional packages from a public repository, you can use pip. An external access integration (EAI) is required for Snowflake Notebooks to install packages from external endpoints. To configure EAIs, see Set up external access for Snowflake Notebooks. However, if a package is already part of the base image, you can't change its version by installing a different version with pip install. For a list of the pre-installed packages, see Container Runtime for ML.
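Before installing anything, you can check which version of a package ships with the base image from a Python cell. The following is a minimal sketch; scipy is used only as an illustrative package name:

import importlib.metadata

# Print the version of a package that is already part of the base image.
print(importlib.metadata.version("scipy"))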

Compute pool:

A compute pool provides the compute resources for your notebook kernel and Python code. Use smaller, CPU-based compute pools to get started, and select higher-memory, GPU-based compute pools to optimize for intensive GPU usage scenarios like computer vision or LLMs/VLMs.

Note that each compute node is limited to running one notebook per user at a time. You should set the MAX_NODES parameter to a value greater than one when creating compute pools for notebooks. For an example, see Create compute resources. For more details on Snowpark Container Services compute pools, see Snowpark Container Services: Working with compute pools.
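For example, an administrator could create a small CPU compute pool that lets several notebook users run at the same time. The following sketch issues the statement from a Snowpark session (you can also run the equivalent statement in a SQL worksheet); the pool name and instance family are placeholders, so adjust them and the MAX_NODES value for your account:

from snowflake.snowpark.context import get_active_session

session = get_active_session()
# Hypothetical pool name and instance family; MAX_NODES > 1 lets multiple
# notebook users run on the pool at the same time.
session.sql("""
    CREATE COMPUTE POOL IF NOT EXISTS my_notebook_pool
      MIN_NODES = 1
      MAX_NODES = 4
      INSTANCE_FAMILY = CPU_X64_M
""").collect()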

When a notebook is not being used, consider shutting it down to free up node resources. You can shut down a notebook by selecting End session from the connection dropdown.

Run a notebook on Container Runtime for ML

After you create your notebook, you can start running code immediately by adding and running cells. For information about adding cells, see Develop and run code in Snowflake Notebooks.

Importing more packages

In addition to the pre-installed packages that get your notebook up and running, you can install packages from public sources for which you have external access set up. You can also use packages stored in a stage or in a private repository. The ACCOUNTADMIN role, or a role that can create external access integrations (EAIs), must set up and grant you access to specific external endpoints. Use the ALTER NOTEBOOK command to enable external access on your notebook. Once access is granted, the EAIs appear in Notebook settings. Toggle the EAIs on before you start installing from external channels. For instructions, see Provision external access integration.
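For example, once an EAI that allows access to PyPI exists, it can be attached to the notebook with ALTER NOTEBOOK. The sketch below runs the statement through a Snowpark session; the notebook name, the integration name, and the EXTERNAL_ACCESS_INTEGRATIONS property shown here are assumptions based on the standard external access setup, so confirm them against the external access instructions for your account:

from snowflake.snowpark.context import get_active_session

session = get_active_session()
# Hypothetical notebook and integration names; attaching the EAI lets pip
# reach the external package repository from the container.
session.sql("""
    ALTER NOTEBOOK my_db.my_schema.my_notebook
      SET EXTERNAL_ACCESS_INTEGRATIONS = (pypi_access_integration)
""").collect()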

The following example installs an external package using pip install in a code cell:

!pip install transformers scipy ftfy accelerate

Update notebook settings

You can update settings, such as which compute pool or warehouse to use, at any time in Notebook settings, which you can access through the Notebook actions menu at the top right.

One of the settings you can update in Notebook settings is the idle timeout setting. The default for idle timeout is 1 hour, and you can set it for up to 72 hours. To set this in SQL, use the CREATE NOTEBOOK or ALTER NOTEBOOK command to set the IDLE_AUTO_SHUTDOWN_TIME_SECONDS property of the notebook.
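For example, to extend the idle timeout to two hours, you could run a statement like the following from a Snowpark session; the notebook name is a placeholder, and the exact ALTER NOTEBOOK syntax should be confirmed in the SQL reference:

from snowflake.snowpark.context import get_active_session

session = get_active_session()
# Set idle auto-shutdown to 2 hours (7200 seconds) for a hypothetical notebook.
session.sql("""
    ALTER NOTEBOOK my_db.my_schema.my_notebook
      SET IDLE_AUTO_SHUTDOWN_TIME_SECONDS = 7200
""").collect()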

Running ML workloads example

Notebooks on Container Runtime for ML are well suited for running ML workloads such as model training and parameter tuning. Runtimes come pre-installed with common ML packages. With an EAI set up, you can install any other packages you need using !pip install.

Note

The Python process caches loaded modules. If you need a different version of an installed package, change it before importing the package in your code. Otherwise, you might have to disconnect and reconnect to the notebook session for the change to take effect.
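For example, if a package has already been imported, the running kernel keeps serving the cached module even after you install a different version. The following is a minimal sketch of how you could spot the mismatch; scipy is used only as an illustrative package name:

import importlib.metadata
import sys

# Version of the module already loaded in this Python process (if any).
if "scipy" in sys.modules:
    print("cached:", sys.modules["scipy"].__version__)

# Version currently installed on disk, which a fresh session would pick up.
print("installed:", importlib.metadata.version("scipy"))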

The following examples show how to use some of the available libraries for your ML workload.

Use OSS ML libraries

The following example uses an OSS ML library, xgboost, with an active Snowpark session to fetch data directly into memory for training:

from snowflake.snowpark.context import get_active_session
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

session = get_active_session()
df = session.table("my_dataset")
# Pull data into local memory
df_pd = df.to_pandas()
X = df_pd[['feature1', 'feature2']]
y = df_pd['label']
# Split data into test and train in memory
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=34)
# Train in memory
model = xgb.XGBClassifier()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)

Use Snowpark ML modeling APIs

When Snowflake’s ML modeling APIs are used on Container Runtime for ML, all execution (including training and prediction) happens on the container runtime directly instead of pushing down to the query warehouse. Snowflake ML on Container Runtime can pull data faster and is recommended for larger scale training. With the GPU runtime, Snowflake ML will by default use all GPUs to accelerate training.

The following code block example uses XGBoost for modeling:

from snowflake.snowpark.context import get_active_session
from snowflake.ml.modeling.xgboost import XGBClassifier
from snowflake.ml.modeling.metrics import accuracy_score

session = get_active_session()
df = session.table("my_dataset")
feature_cols = ['FEATURE1', 'FEATURE2']
label_col = 'LABEL'
predicted_col = ['PREDICTED_LABEL']
df = df[feature_cols + [label_col]]
# Split is pushed down to the associated warehouse
train_df, test_df = df.random_split(weights=[0.85, 0.15], seed=34)

model = XGBClassifier(
    input_cols=feature_cols,
    label_cols=label_col,
    output_cols=predicted_col,
    # This enables leveraging all GPUs on the node
    tree_method="gpu_hist",
)
# Train
model.fit(train_df)
# Predict
result = model.predict(test_df)

accuracy = accuracy_score(
    df=result,
    y_true_col_names=label_col,
    y_pred_col_names=predicted_col,
)

The following is an example using Light Gradient Boosting Machine (LightGBM):

from snowflake.snowpark.context import get_active_session
from snowflake.ml.modeling.lightgbm import LGBMClassifier
from snowflake.ml.modeling.metrics import accuracy_score

session = get_active_session()
df = session.table("my_dataset")
feature_cols = ['FEATURE1', 'FEATURE2']
label_col = 'LABEL'
predicted_col = ['PREDICTED_LABEL']

df = df[feature_cols + [label_col]]
# Split is pushed down to the associated warehouse
train_df, test_df = df.random_split(weights=[0.85, 0.15], seed=34)

model = LGBMClassifier(
    input_cols=feature_cols,
    label_cols=label_col,
    output_cols=predicted_col,
    # This enables leveraging all GPUs on the node
    device_type="gpu",
)

# Train
model.fit(train_df)
# Predict
result = model.predict(test_df)

accuracy = accuracy_score(
    df=result,
    y_true_col_names=label_col,
    y_pred_col_names=predicted_col,
)

Use new container-optimized libraries

Container Runtime for ML pre-installs new APIs tailored specifically for ML training in the container environment. The first of these is the data connector API, which provides a single interface for connecting Snowflake data sources (including tables, DataFrames, and Datasets) to popular ML frameworks (such as PyTorch and TensorFlow). This API is available in the snowflake.ml.data package. More container-optimized APIs will follow.
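For example, a Snowpark DataFrame can be handed to a PyTorch training loop through the data connector. The sketch below assumes a DataConnector class with from_dataframe and to_torch_dataset methods in the snowflake.ml.data package; check the package reference for the exact interface available in your runtime:

from snowflake.snowpark.context import get_active_session
from snowflake.ml.data.data_connector import DataConnector

session = get_active_session()
df = session.table("my_dataset")

# Wrap the Snowpark DataFrame in a data connector and expose it as a
# PyTorch-compatible dataset for use with a DataLoader.
connector = DataConnector.from_dataframe(df)
torch_dataset = connector.to_torch_dataset()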

Limitations

  • Snowpark ML Modeling API supports only the predict, predict_proba, and predict_log_proba inference methods on the Container Runtime. Other methods run in the query warehouse.

  • Snowpark ML Modeling API supports only sklearn-compatible pipelines on the Container Runtime.

  • Snowpark ML Modeling API does not support preprocessing or metrics classes on Container Runtime for ML. These run in the query warehouse.

  • The fit, predict, and score methods are executed on Container Runtime for ML. Other Snowpark ML methods run in the query warehouse.

  • sample_weight_cols is not supported for XGBoost or LightGBM models.

Cost/billing considerations

When running notebooks on Container Runtime for ML, you may incur both virtual warehouse compute costs and Snowpark Container Services (SPCS) compute costs.

Snowflake Notebooks require a virtual warehouse to run SQL and Snowpark queries for optimized performance. Therefore, you might also incur virtual warehouse compute costs if you run SQL in SQL cells or Snowpark push-down queries in Python cells. The following diagram shows where compute happens for each type of cell.

Diagram showing the compute distribution of notebook cells.

For example, the following Python code uses the xgboost library. The data is pulled into the container, and compute occurs on Snowpark Container Services:

from snowflake.snowpark.context import get_active_session
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

session = get_active_session()
df = session.table("my_dataset")
# Pull data into local memory
df_pd = df.to_pandas()
X = df_pd[['feature1', 'feature2']]
y = df_pd['label']
# Split data into test and train in memory
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=34)
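By contrast, operations that Snowpark pushes down run on the notebook's query warehouse rather than on the compute pool. The following is a minimal sketch, reusing the hypothetical my_dataset table from above:

from snowflake.snowpark.context import get_active_session

session = get_active_session()
df = session.table("my_dataset")

# This aggregation is pushed down and executed on the query warehouse,
# not on the Snowpark Container Services compute pool.
row_counts = df.group_by("label").count().collect()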

To learn more about warehouse costs, see Overview of warehouses and Warehouse recommendations for running Snowflake Notebooks.