Work with files in Snowflake Notebooks

This topic describes how you can upload and access files from your Snowflake Notebooks.

Files in notebook environments

When you create a new notebook, two files are created. You can view these in the Files pane on the left side of the notebook. The files are stored in an internal stage which represents your notebook environment. Files stored in this internal stage persist between sessions.

  • Main notebook file: By default, this is named notebook_app.ipynb. If your notebook is created from Git or uploaded from another .ipynb file, the filename may be different.

  • environment.yml: This is an autogenerated file that describes your notebook environment, such as which packages are installed.

To inspect the contents of a file, select the file name. A pop-up appears with a preview of the file content. Note that files are read-only. To modify the contents of a file, download it, edit it locally, and then upload the updated copy.

Temporary filesystem in a notebook environment

Your notebook has a temporary filesystem that is available during an active session. Any files created during the session are saved in this temporary filesystem and are no longer available after you exit the current notebook session.

The following code creates a file called myfile.txt and writes some text in it:

# The with statement closes the file automatically
with open("myfile.txt", "w") as f:
    f.write("abc")

You can access this file during the same session in which it was created.

Use os.listdir() to list the files in the temporary filesystem:

import os
os.listdir()

Now disconnect from your current session and reconnect. If you call os.listdir() again, myfile.txt is no longer listed.
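
Because files in the temporary filesystem disappear when the session ends, code that depends on such a file may want to check for it before reading. The following is a minimal sketch of that idea; ensure_file is a hypothetical helper name chosen for illustration, not part of any Snowflake API.

```python
import os

def ensure_file(path, default_text):
    """Create `path` with `default_text` if it is missing.

    Returns True if the file had to be created, False if it already existed.
    Hypothetical helper for illustration only.
    """
    if os.path.exists(path):
        return False
    with open(path, "w") as f:
        f.write(default_text)
    return True

# After a reconnect the file is gone, so this call would recreate it.
created = ensure_file("myfile.txt", "abc")
with open("myfile.txt") as f:
    print(f.read(), "(recreated)" if created else "(already present)")
```

This only restores content that the code can regenerate. Anything that must survive a session needs a persistent location.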

Files persisted across Notebook sessions

To persist your files across Notebook sessions:

Store Files in a Snowflake stage

If you want your files to persist between sessions and to be available from different notebooks, store them in a Snowflake stage. You can upload files from your local machine to the stage and use file operations from the Snowpark API to access them from your notebook.

Example

This example shows how to create a stage from your notebook, store a file in it, and check that the file is still there after you reconnect.

To create a stage called permanent_stage, run the following code in a SQL cell:

CREATE OR REPLACE STAGE permanent_stage;

Next, to create a file called myfile.txt with some text in it, run the following code in a Python cell:

# The with statement closes the file automatically
with open("myfile.txt", "w") as f:
    f.write("abc")

Note that at this point, myfile.txt is stored in the notebook’s temporary filesystem. To copy it to the stage, use the Snowpark API to upload myfile.txt to permanent_stage:

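from snowflake.snowpark.context import get_active_session

# Get the session that the notebook is already connected with
session = get_active_session()

# Upload the file as-is (no compression) to the stage
put_result = session.file.put("myfile.txt", "@PERMANENT_STAGE", auto_compress=False)
put_result[0].status  # upload status of the first (only) file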

If you disconnect your session and reconnect, you can run the following code in a SQL cell to see that the file is still there:

LS @permanent_stage;
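
After a reconnect, the file exists on the stage but no longer in the notebook's temporary filesystem. The sketch below shows one way to bring it back with the Snowpark file.get operation. The helper name restore_from_stage is hypothetical, and the commented usage assumes the permanent_stage and myfile.txt from this example.

```python
def restore_from_stage(session, stage_file, target_dir="."):
    """Download `stage_file` into `target_dir` and return the downloaded file names.

    `session` is expected to be a Snowpark Session. The helper itself is a
    hypothetical name used only for this illustration.
    """
    results = session.file.get(stage_file, target_dir)
    return [r.file for r in results]

# In a notebook Python cell (assumes the stage and file created above):
#   from snowflake.snowpark.context import get_active_session
#   session = get_active_session()
#   restore_from_stage(session, "@permanent_stage/myfile.txt")
```

Passing the session in as an argument keeps the helper independent of how the session was obtained.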

Add Files to Notebook from local computer

You can upload files from your local computer to be used in your Snowflake notebook.

  1. Sign in to Snowsight.

  2. Select Projects » Notebooks.

  3. In the Files tab, next to the database object explorer, select the Add icon to select files to upload.

  4. Browse and select or drag and drop files into the dialog.

  5. Select Upload to upload your file.

Uploaded files are saved to the notebook’s internal stage and persisted between sessions. You can reference uploaded files using their local paths from the notebook file. See Referencing Files in Notebooks.

Warning

If your notebook session is active when you upload the file, you need to restart the session before the uploaded file is accessible. This is a known bug. Snowflake recommends adding all the files you need before you start your session.

Sync with Files from Git

If your notebook is connected to Git, all the files in the same Git folder as your notebook are displayed in the Files tab.

For more information on working with files in Git, see Sync Snowflake Notebooks with a Git repository.

Referencing Files in Notebooks

Each file in the notebook environment has a stage path and a local path. You can use these paths to reference the file in the notebook.

Referencing local path with Python

In general, Python libraries use the local path to reference a file. For example, the following code accesses the data.csv file that was uploaded to the same directory as the notebook in which the code runs:

import pandas as pd
df = pd.read_csv("data.csv")

Referencing Stage path with SQL

With SQL, Snowflake references files based on the stage path. The stage path for a file in your notebook is based on the following format:

snow://notebook/<DATABASE>.<SCHEMA>.<NOTEBOOK_NAME>/versions/live/<file_name>

To find the stage path of a file in your notebook stage, use the Copy path menu:

  1. Sign in to Snowsight.

  2. Select Projects » Notebooks.

  3. In the Files tab, next to the database object explorer, select the More options icon next to the file you want to get the path for.

  4. Select Copy path. This copies the path of the file to your clipboard.

Then you can use the following SQL statement to list the stage file details:

LIST 'snow://notebook/<DATABASE>.<SCHEMA>.<NOTEBOOK_NAME>/versions/live/data.csv';
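
If you need to build stage paths in code rather than copy them from the UI, the format above can be assembled from its parts. This is a minimal sketch: notebook_stage_path is a hypothetical helper, the database, schema, and notebook names are placeholders, and identifiers that require quoting are not handled.

```python
def notebook_stage_path(database, schema, notebook_name, file_name):
    """Build the stage path of a file in a notebook's live version."""
    return (
        f"snow://notebook/{database}.{schema}.{notebook_name}"
        f"/versions/live/{file_name}"
    )

# MY_DB, MY_SCHEMA, and MY_NOTEBOOK are placeholder names for illustration.
path = notebook_stage_path("MY_DB", "MY_SCHEMA", "MY_NOTEBOOK", "data.csv")
print(f"LIST '{path}';")
```

The printed statement has the same shape as the LIST example and can be pasted into a SQL cell.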

Access control requirements

You need to use a role with the following privileges to access files from a stage in a notebook.

Privilege    Object
USAGE        Stage that contains the files.

Limitations and considerations

  • Load files before starting your notebook session. If you load files after a session has started, you have to restart your session to access the files.

  • There are no restrictions on the types of files you can upload.

  • Each file must be 250 MB or less.

  • Files that are written to a local path in the notebook do not show up in the Files UI. This is a known bug. However, you can still use the file in your notebook code.

    For example, if you create a file, data.json, you can access it as shown in the following code even though it won’t be visible in the Files UI:

    import pandas as pd

    # Generate a sample JSON Lines file (one JSON object per line)
    with open("data.json", "w") as f:
        f.write('{"fruit":"apple", "size":3.4, "weight":1.4}\n')
        f.write('{"fruit":"orange", "size":5.4, "weight":3.2}\n')
    # Read from the local JSON file (the file doesn't show in the Files UI)
    df = pd.read_json("data.json", lines=True)
    df
    
  • Opening another .ipynb file that is not the main notebook file is not supported.
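
The per-file size limit can be checked locally before you upload. The following is a rough sketch under one stated assumption: it treats 1 MB as 1024 * 1024 bytes, which the limit above does not specify. The name oversized_files is hypothetical.

```python
import os

# Assumption: 1 MB = 1024 * 1024 bytes. The documented limit does not say
# whether MB is binary or decimal.
MAX_UPLOAD_BYTES = 250 * 1024 * 1024

def oversized_files(paths, limit=MAX_UPLOAD_BYTES):
    """Return the paths whose size on disk exceeds `limit` bytes."""
    return [p for p in paths if os.path.getsize(p) > limit]

# Example call with placeholder file names:
#   oversized_files(["data.csv", "model.pkl"])
```

An empty result means every listed file is within the assumed limit.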

Additional resources