Develop and run code in Snowflake Notebooks¶
This topic describes how to write and run SQL, Python, and Markdown code in Snowflake Notebooks.
Notebook cell basics¶
This section introduces some basic cell operations. When you create a notebook, three example cells are displayed. You can modify those cells or add new ones.
Create a new cell¶
Snowflake Notebooks support three types of cells: SQL, Python, and Markdown. To create a new cell, you can either hover over an existing cell or scroll to the bottom of the notebook, then select one of the buttons for the cell type you want to add.
You can change the language of the cell any time after it’s created by using one of two methods:
Select the language dropdown menu and then select a different language.
Move cells¶
You can move a cell either by dragging and dropping the cell using your mouse or by using the actions menu:
(Option 1) Hover your mouse over the existing cell you want to move. Select the (drag and drop) icon on the left side of the cell and move the cell to its new location.
(Option 2) Select the vertical ellipsis (actions) menu. Then select the appropriate action.
Note
To just move the focus between cells, use the Up and Down arrows.
Delete a cell¶
To delete a cell, complete the following steps in a notebook:
You can use a keyboard shortcut to delete a cell as well.
For considerations when using Python and SQL cells, see Considerations for running notebooks.
Run cells in Snowflake Notebooks¶
To run Python and SQL cells in Snowflake Notebooks, you can:
Run a single cell: Choose this option when making frequent code updates.
Run all cells in a notebook in sequential order: Choose this option before presenting or sharing a notebook to ensure that the recipients see the most current information.
Press CMD + shift + return on a Mac keyboard, or CTRL + Shift + Enter on a Windows keyboard.
Select Run all.
Run a cell and advance to the next cell: Choose this option to run a cell and move on to the next cell more quickly.
Run all above: Choose this option when running a cell that references the results of earlier cells.
Run all below: Choose this option when running a cell that later cells depend on. This option runs the current cell and all following cells.
When one cell is running, other run requests are queued and will be executed once the actively running cell finishes.
Inspect cell status¶
The status of the cell run is indicated by the colors displayed by the cell. This status color is displayed in two places, the left wall of the cell and in the right cell navigation map.
Cell status color:
Blue dot: The cell was modified but hasn’t run yet.
Red: The cell ran in the current session and an error occurred.
Green: The cell ran in the current session without errors.
Moving green: The cell is currently running.
Gray: The cell has run in a previous session and the results shown are from the previous session. Cell results from the previous interactive session are kept for 7 days. Interactive session means the user runs the notebook in an interactive manner in Snowsight rather than those that were run by a schedule or the EXECUTE NOTEBOOK SQL command.
Blinking gray: The cell is waiting to be run after you select Run All.
Note
Markdown cells do not show any status.
After a cell finishes running, the time it took to run is displayed at the top of the cell. Select this text to view the run details, including start and end times and total elapsed time.
SQL cells contain additional information, such as the warehouse used to run the query, rows returned, and a hyperlink to the query ID page.
Stop a running cell¶
To stop the execution of any code cells that are currently running, select Stop on the top right of the cell. You can also select Stop on the top right of the Notebooks page. While cells are running, Run all becomes Stop.
This stops the execution of the cell that is currently running and all subsequent cells that have been scheduled to run.
Format text with Markdown¶
To include Markdown in your notebook, add a Markdown cell:
Use a keyboard shortcut and select Markdown, or select + Markdown.
Select the Edit markdown pencil icon or double click on the cell, and start writing Markdown.
You can type valid Markdown to format a text cell. As you type, the formatted text appears below the Markdown syntax.
To view only the formatted text, select the Done editing checkmark icon.
Note
Markdown cells currently do not support rendering of HTML.
Markdown basics¶
This section describes basic Markdown syntax to get you started.
Headers
Heading level |
Markdown syntax |
Example |
---|---|---|
Top level |
# Top-level Header
|
|
2nd-level |
## 2nd-level Header
|
|
3rd-level |
### 3rd-level Header
|
Inline text formatting
Text format |
Markdown syntax |
Example |
---|---|---|
Italics |
*italicized text*
|
|
Bold |
**bolded text**
|
|
Link |
[Link text](url)
|
Lists
List type |
Markdown syntax |
Example |
---|---|---|
Ordered list |
1. first item
2. second item
1. Nested first
2. Nested second
|
|
Unordered list |
- first item
- second item
- Nested first
- Nested second
|
Code formatting
Language |
Markdown syntax |
Example |
---|---|---|
Python |
```python
import pandas as pd
df = pd.DataFrame([1,2,3])
```
|
|
SQL |
```sql
SELECT * FROM MYTABLE
```
|
Embed images
File type |
Markdown syntax |
Example |
---|---|---|
Image |
![<alt_text>](<path_to_image>)
|
For a notebook that demonstrates these Markdown examples, see the Markdown cells section of the visual data stories notebook.
Understanding cell outputs¶
When you run a Python cell, the notebook displays the following types of output from the cell are displayed in the results:
Any results written to the console, such as logs, errors, and warnings and output from print() statements.
DataFrames are automatically printed with Streamlit’s interactive table display,
st.dataframe()
.The supported DataFrame display types include pandas DataFrame, Snowpark DataFrames, and Snowpark Tables.
For Snowpark, printed DataFrames are evaluated eagerly without the need to run the
.show()
command. If you prefer not to evaluate the DataFrame eagerly, for example when running the notebook in non-interactive mode, Snowflake recommends removing the DataFrame print statements to speed up the overall runtime of your Snowpark code.
Visualizations are rendered in outputs. To learn more about visualizing your data, see Visualize data in Snowflake Notebooks.
Additionally, you can access the results of your SQL query in Python and vice versa. See Reference cells and variables in Snowflake Notebooks.
Cell output limits¶
Only 10,000 rows or 8 MB of DataFrame output is shown as cell results, whichever is lower. However, the entire DataFrame is still available in the notebook session for use. For example, even though the entire DataFrame isn’t rendered, you can still perform data transformation tasks.
For each cell, only 20 MB of output is allowed. If the size of the cell output exceeds 20 MB, the output will be dropped. Consider splitting the content into multiple cells if that happens.
Reference cells and variables in Snowflake Notebooks¶
You can reference the previous cell results in a notebook cell. For example, to reference the result of a SQL cell or the value of a Python variable, see the following tables:
Note
The cell name of the reference is case-sensitive and must exactly match the name of the referenced cell.
Referencing SQL output in Python cells:
Reference cell type |
Current cell type |
Reference syntax |
Example |
---|---|---|---|
SQL |
Python |
|
Convert a SQL results table to a Snowpark DataFrame. If you have the following in a SQL cell called SELECT 'FRIDAY' as SNOWDAY, 0.2 as CHANCE_OF_SNOW
UNION ALL
SELECT 'SATURDAY',0.5
UNION ALL
SELECT 'SUNDAY', 0.9;
You can reference the cell to access the SQL result: snowpark_df = cell1.to_df()
Convert the result to a pandas DataFrame: my_df = cell1.to_pandas()
|
Referencing variables in SQL code:
Important
In SQL code, you can only reference Python variables of type string
. You cannot reference a Snowpark DataFrame, pandas DataFrame or
other Python native DataFrame format.
Reference cell type |
Current cell type |
Reference syntax |
Example |
---|---|---|---|
SQL |
SQL |
|
For example, in a SQL cell named SELECT * FROM {{cell2}} where PRICE > 500
|
Python |
SQL |
|
For example, in a Python cell named Using Python variable as a value c = "USA"
You can reference the value of the variable SELECT * FROM my_table WHERE COUNTRY = '{{c}}'
Using Python variable as an identifier If the Python variable represents a SQL identifier like a column or table name: column_name = "COUNTRY"
If the Python variable represents a SQL identifier, such as a column or table name ( SELECT * FROM my_table WHERE {{column_name}} = 'USA'
Make sure to differentiate between variables used as values (with quotes) and as identifiers (without quotes). Note: Referencing Python DataFrames is not supported. |
Considerations for running notebooks¶
Notebooks run using caller’s rights. For additional considerations, see Changing the session context for a notebook.
You can import Python libraries to use in a notebook. For details, see Import Python packages to use in notebooks.
When referencing objects in SQL cells, you must use fully qualified object names, unless you are referencing object names in a specified database or schema. See Changing the session context for a notebook.
Notebook drafts are saved every three seconds.
You can use Git integration to maintain notebook versions.
You can configure an idle timeout setting to automatically shut down the notebook session once the setting is met. For information, see Idle time and reconnection.
Notebook cell results are only visible to the user who ran the notebook and are cached across sessions. Reopening a notebook displays past results from the last time the user ran the notebook using Snowsight.
BEGIN … END (Snowflake Scripting) is not supported in SQL cells. Instead, use the Session.sql().collect() method in a Python cell to run the scripting block. Chain the
sql
call with a call tocollect
to immediately execute the SQL query.The following code runs a Snowflake scripting block using the
session.sql().collect()
method:from snowflake.snowpark.context import get_active_session session = get_active_session() code_to_run = """ BEGIN CALL TRANSACTION_ANOMALY_MODEL!DETECT_ANOMALIES( INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'ANOMALY_INFERENCE'), TIMESTAMP_COLNAME =>'DATE', TARGET_COLNAME => 'TRANSACTION_AMOUNT', CONFIG_OBJECT => {'prediction_interval': 0.95} ); LET x := SQLID; CREATE TABLE ANOMALY_PREDICTIONS AS SELECT * FROM TABLE(RESULT_SCAN(:x)); END; """ data = session.sql(code_to_run).collect(block=True);