Snowflake Notebooks in Workspaces¶
Overview¶
A Snowflake notebook in Workspaces is fully-managed and built for end-to-end data science and machine learning development on Snowflake data. This new environment for notebooks includes:
Integrated Development Environment (IDE) features - Includes full IDE capabilities for streamlined file management and editing to improve your workflow.
Familiar Jupyter experience - Use a standard Jupyter Python notebook environment that connects directly to your Snowflake data while maintaining all governance controls.
Optimized for AI/ML workloads - Notebooks in Workspaces runs in a preconfigured container designed for scalable AI/ML development and includes fully-managed access to CPUs and GPUs, parallel data loading, and distributed training APIs for popular ML packages (for example, XGBoost, PyTorch, or LightGBM).
Governed collaboration - Supports simultaneous multi-user collaboration with built-in governance. Track all changes and maintain a complete history using Git or shared workspaces.
Benefits for machine learning (ML) workflows¶
Notebooks in Workspaces provides two primary capabilities for ML workflows.
End-to-end workflow - The platform enables users to consolidate their complete ML lifecycle, from source data access to model inference, within a single Jupyter notebook environment. This environment is integrated with the underlying data platform, allowing it to inherit existing governance and security controls for the data and code assets.
Scalable model development architecture - The architecture supports the development of scalable models by providing open-source software (OSS) model development capabilities. Users can access distributed data loading and training across designated CPU or GPU compute pools. This design simplifies ML infrastructure management by abstracting the need for manual configuration of distributed compute resources.
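As a rough sketch of what that end-to-end flow looks like in a notebook cell, the example below reads a Snowflake table with Snowpark, trains an open-source XGBoost model, and writes predictions back to a table. The database, schema, table, and column names are hypothetical, and this is a minimal single-node example; it does not show the distributed data loading or training APIs mentioned above.

```python
from snowflake.snowpark import Session
import xgboost as xgb

session = Session.builder.getOrCreate()

# Read a governed Snowflake table into memory for model development.
df = session.table("MY_DB.MY_SCHEMA.CHURN_FEATURES").to_pandas()
X = df.drop(columns=["LABEL"])
y = df["LABEL"]

# Train an open-source XGBoost model inside the notebook's container.
model = xgb.XGBClassifier(n_estimators=100, max_depth=6)
model.fit(X, y)

# Write predictions back to Snowflake for downstream use.
df["PREDICTION"] = model.predict(X)
session.create_dataframe(df).write.save_as_table(
    "MY_DB.MY_SCHEMA.CHURN_SCORED", mode="overwrite"
)
```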
Governance and access control¶
To enable users to create and run Notebooks in Workspaces, which run exclusively on the Container Runtime (a service powered by Snowpark Container Services), specific privileges are required:
| Action | Required Privilege | Granted On | Notes |
|---|---|---|---|
| Creation of notebook files | OWNERSHIP | The workspace | Enables users to create notebook files in the workspace. |
| Execution of notebooks | USAGE | The underlying compute pool | Required for the notebook’s execution service to run. Compute pools provide the containerized compute for the notebook’s kernel. By default, the |
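For example, an administrator might grant the USAGE privilege described above as follows. This is a minimal sketch run through the Snowpark session; the compute pool and role names are hypothetical, and the same GRANT can be run from any SQL surface.

```python
from snowflake.snowpark import Session

session = Session.builder.getOrCreate()

# Allow a role to run notebook services on a compute pool.
# MY_COMPUTE_POOL and DATA_SCIENTIST_ROLE are hypothetical names.
session.sql(
    "GRANT USAGE ON COMPUTE POOL MY_COMPUTE_POOL TO ROLE DATA_SCIENTIST_ROLE"
).collect()
```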
Notebook service management¶
When a user runs a notebook, a Snowflake-managed notebook service is dynamically created on a compute pool to host the Python kernel and facilitate execution. These services are personal to each user, can only be used to run notebooks, and are located within the user’s Personal Database (PDB).
Administrator control and cost monitoring¶
Administrators manage user access and monitor costs primarily through the associated compute pools.
Disable notebook execution: Administrators can disable the ability for users to run Notebooks in Workspaces by removing the USAGE privilege on compute pools for a user’s roles.
Drop services: Administrators can drop a notebook service via SQL:
```sql
DROP SERVICE USER$DB_NAME.PUBLIC.[SERVICE_NAME];
```
Alternatively, administrators can use Snowsight:
Sign in to Snowsight.
In the navigation menu, select Monitoring » Services & jobs.
Notebooks in Workspaces features¶
The tables below outline the core features of Snowflake Notebooks in Workspaces and the purpose or benefit each feature provides. The new Notebooks experience offers enhanced performance, improved developer productivity, and Jupyter compatibility.
Integration with Workspaces¶
| Feature | Description |
|---|---|
| Notebooks are files in Workspaces | The Workspaces environment supports easy file management, allowing you to iterate on individual notebooks and project files. Create folders, upload files, and organize notebooks. Notebook files open in tabs in your workspace and are editable and executable. |
| Git Workspaces integration | Collaboration is streamlined by maintaining a single source of truth compatible with different development environments. Connect to a Git repo by creating a new workspace and selecting Workspaces » From Git repository. You can pull in your files, create and switch branches, and push changes back with diff resolution. |
Updates to compute and cost management¶
| Feature | Description |
|---|---|
| Snowpark Container Services compute pools | Provides access to CPU/GPU machine types and optimizes cost and compute power by allowing users to select the exact CPU/GPU resources needed for the workload. For more details, see Notebook usage and cost monitoring. |
| Shared container service connection | Reduces notebook start-up time and improves resource utilization. After the first notebook connects to a container service, other notebooks can quickly connect to the same container service and share the compute resources of a single compute pool node. Each notebook still maintains its own virtual environment. |
| Background kernel persistence | Ensures uninterrupted execution of critical, long-running processes like ML training and data engineering jobs. Notebook kernels run until idle timeout, independent of frontend or client connection status. |
| Simplified idle time configuration | Simplifies cost management by preventing unused compute resources from running indefinitely. Idle time is configured on the container service, automatically shutting down the service after a defined period of inactivity. |
| Service-level EAI management | External access integrations (EAIs) are configured once on the container service and apply to all notebooks in the same workspace. It’s no longer necessary to manually configure EAIs for each individual notebook. |
Jupyter compatibility¶
| Feature | Description |
|---|---|
| Jupyter magic commands | Provides a familiar development experience by leveraging standard Jupyter utilities and productivity features such as cell and line magics. Use |
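For instance, standard IPython magics behave as they do in any Jupyter environment. The examples below are common magics and shell commands rather than a documented list of what is enabled in this runtime.

```python
%%time
# The %%time cell magic (placed on the first line) reports how long the cell takes.
import math

total = sum(math.sqrt(i) for i in range(1_000_000))

# Line magics and shell commands also work in cells, for example:
# %env MY_FLAG=1    -> set an environment variable for the kernel
# !ls /tmp          -> run a shell command inside the container
```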
Package management¶
| Feature | Description |
|---|---|
| Pre-installed data science (DS) and ML packages | Provides a flexible and streamlined environment for immediate development without complex initial package installation. Popular packages are pre-installed in the Snowflake Runtime and can be directly imported. |
| Install packages via requirements.txt | Specify and install required package versions using a `requirements.txt` file. Note: If the package version specified in |
| Install packages from PyPI or other repos | Download packages directly using |
| Install packages from Workspaces file upload | Download or build |
| Import from workspace | Import modules from other files in your workspace, for example: `from my_utils import my_func`. |
| Import from stage | Enables secure and governed package deployment by leveraging existing Snowflake data storage and governance controls for package files. Use the Snowpark session to retrieve package files from a Snowflake stage into the container environment for import and use. See the example following this table. |

For example, to import a module stored on a stage:

```python
from snowflake.snowpark import Session
import sys

session = Session.builder.getOrCreate()

# Download the module from the stage into the container's local filesystem.
session.file.get("@stage_name/math_tools.py", "/tmp/")

# Make the download location importable, then use the module.
sys.path.append("/tmp/")
import math_tools

math_tools.add_one(3)
```
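As a concrete sketch of the requirements.txt and PyPI options above, the cell below installs packages with pip or uv. It assumes the container service has an EAI that permits outbound access to PyPI and that a `requirements.txt` file exists in the notebook's working directory; the package name and version are hypothetical.

```python
# Install everything pinned in requirements.txt (file location is an assumption).
!pip install -r requirements.txt

# Or install a single package directly from PyPI (hypothetical name and version).
!uv pip install some-package==1.2.3

# pip freeze lists base-image and pip-installed packages; uv pip freeze only lists
# packages installed with uv (see the Limitations section).
!pip freeze
```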
Updates to notebook editing¶
| Feature | Description |
|---|---|
| Bidirectional SQL <> Python cell referencing | Optimizes developer productivity by allowing seamless language switching and direct reuse of results and variables across cells. SQL results can be directly referenced as |
| Interactive datagrid and automated chart builder | Provides a high-performance, consistent user experience with data manipulation and visualization across Workspaces editing surfaces. You can view, search, filter, and sort results on millions of records and generate charts without code. |
| Enhanced minimap and cell status | The minimap improves notebook organization and assists with debugging and navigation through clear section outlines and execution status tracking. A table of contents is generated from Markdown headers and displays a comprehensive, in-session status for each cell (running, succeeded, failed, and modified). |
| Use comments to name code cells | The first line comment in a Python or SQL cell is used as the cell’s name in the minimap, simplifying navigation and providing contextual labeling for cells within large notebooks. |
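For example, the first-line comment in the cell below is what appears as the cell's name in the minimap; the code itself is an arbitrary placeholder.

```python
# Build a toy feature table
import pandas as pd

df = pd.DataFrame({"numerator": [1, 2, 3], "denominator": [2, 4, 8]})
df["ratio"] = df["numerator"] / df["denominator"]
df
```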
Classic Snowflake Notebooks vs. Notebooks in Workspaces¶
| Feature | Replacement |
|---|---|
| Warehouse Runtime no longer supported | Notebooks in Workspaces run exclusively on Container Runtime. Container Runtime provides a simplified user experience with the benefits of CPU/GPU compute. |
| Anaconda packages no longer supported | Use the pre-installed Container Runtime packages, use EAIs to install more packages including PyPI, or use |
| The | It isn’t necessary to manually convert a SQL result to a DataFrame. Use |
| Need to adjust context | Notebooks are now in Workspaces, and you may need to explicitly set your role and warehouse using the dropdown in the upper-right corner of the Workspaces editor if you do not have defaults set. You also must set the context in a cell to query your data assets using the following SQL commands: `USE DATABASE <database_name>;` and `USE SCHEMA <schema_name>;`. You can also query using fully qualified names, for example: `SELECT * FROM <database_name>.<schema_name>.<table_name>;` |
| Cell names | Cell names are temporarily unavailable. To name a code cell, use a comment in the first line of the cell (for example, `# Cell name`). |
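If you prefer to set that context from Python rather than SQL, the Snowpark session exposes equivalent calls, as in the sketch below; the role, warehouse, database, schema, and table names are hypothetical.

```python
from snowflake.snowpark import Session

session = Session.builder.getOrCreate()

# Equivalent to selecting a role/warehouse in the UI and running USE statements.
session.use_role("DATA_SCIENTIST_ROLE")
session.use_warehouse("ANALYTICS_WH")
session.use_database("MY_DB")
session.use_schema("MY_SCHEMA")

# Fully qualified names also work without changing the session context.
rows = session.sql("SELECT COUNT(*) FROM MY_DB.MY_SCHEMA.MY_TABLE").collect()
```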
Limitations¶
Renaming your notebook file, other files, folders, or the workspace may cause unexpected behaviors such as getting disconnected from the service, clearing the notebook’s output cache, or delays in updating the referenced files. Try reconnecting your notebooks if you get disconnected. If you renamed the workspace, try creating and using a new service.
The current account limit is 200 active services. Notebooks in different workspaces cannot share the same service. By default, notebooks in the same workspace connect to the same service. However, users can also create more than one service per workspace and connect different notebooks to different services.
Notebook services may be restarted over the weekend for container service maintenance. Afterwards, you must rerun your notebooks and reinstall packages to restore variables and packages.
Sharing notebooks to different roles is not yet supported. Use Git-backed Workspaces to sync changes to your Git repo for collaboration.
Folders and files created from the command line (for example, `!mkdir` or `df.to_csv("my_table.csv", index=False)`) are only available on the same service until the service is suspended. Files written to the Workspace directory (starting with `/workspace`) do not appear in the Workspaces File Explorer (the left pane in the Workspaces UI) or persist if the notebook is connected to a different service. To ensure file writebacks are persisted, save the files to a Snowflake stage with write access using the Snowpark file operation APIs (see the sketch below).
iPywidgets are not yet supported.
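As noted in the item above, one way to persist locally written files is to push them to a Snowflake stage with the Snowpark file API, as in the sketch below; `@MY_STAGE` is a hypothetical stage that the notebook's role can write to.

```python
import pandas as pd
from snowflake.snowpark import Session

session = Session.builder.getOrCreate()

# Write locally first; this copy is lost when the service is suspended or replaced.
df = pd.DataFrame({"a": [1, 2, 3]})
df.to_csv("/tmp/my_table.csv", index=False)

# Upload the file to a stage so it persists across services.
session.file.put("/tmp/my_table.csv", "@MY_STAGE/exports/", auto_compress=False)

# Any later service can pull the file back down.
session.file.get("@MY_STAGE/exports/my_table.csv", "/tmp/")
```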
To embed an image in your notebook, upload it to your workspace and then display it using a Python cell. For example:
```python
from IPython.display import Image, display

display(Image(filename="path/to/example_image.png"))
```
Embedding images in Markdown cells and using remote images via URLs is not yet supported. For a cleaner presentation, we recommend collapsing code cells to show only the output results.
Visualization packages that rely on HTML rendering (such as `plotly` and `altair`) are not yet supported.
Custom container images and the artifact repository are not yet available for use with Notebooks in Workspaces.
Snowflake supports downloading packages using `uv pip install`. However, `uv pip freeze` will only list packages installed this way. To see a complete list of packages, including those in the base image and those installed using the standard `pip install`, use the `pip freeze` command.
Enabling secondary roles is no longer required to use personal workspaces. If a user doesn’t have secondary roles set to ALL, they need to select a role that has OWNERSHIP or USAGE privileges on the compute pools and EAIs to create a service. However, if your account has session policies that prevent the use of secondary roles, you will not be able to use Notebooks within personal workspaces. Other personal workspace features, such as SQL files and Git integration, will still be available.
Ask your account representative to contact the Notebooks product team if you have any questions about when specific features will be available.
