Troubleshoot errors in Snowflake Notebooks

The following scenarios can help you troubleshoot issues that might occur when using Snowflake Notebooks.

Unable to connect due to firewall

The following pop-up appears when you try to start your notebook:

Something went wrong. Unable to connect. A firewall or ad blocker might be preventing you from connecting.

Ensure that *.snowflake.app is on the allowlist in your network and can connect to Snowflake. When this domain is on the allowlist, your apps can communicate with Snowflake servers without any restrictions.

No active warehouse selected

To resolve this error, specify a warehouse for the session with the USE WAREHOUSE command, or select a warehouse in your notebook. For steps on how to select a warehouse for your notebook, see Warehouse recommendations for running Snowflake Notebooks.

You’ll also see this error if your current role lacks privileges on the warehouse, database, or schema that the notebook uses. Switch to a role that has access to these resources so that you can continue your work.
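For example, in a SQL cell (MY_ROLE and MY_WH are placeholder names; substitute a role and warehouse you have access to):

```sql
-- Placeholder identifiers, not real objects in your account.
USE ROLE MY_ROLE;
USE WAREHOUSE MY_WH;
```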

Missing packages

The following message appears in the cell output if you try to use a package that is not installed in your notebook environment:

ModuleNotFoundError: Line 2: Module Not Found: snowflake.core. To import packages from Anaconda, install them first using the package selector at the top of the page.

Import the necessary package by following the instructions on the Import Python packages to use in notebooks page.
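Before relying on an optional package, you can also check whether it’s importable in the current environment. The following is a stdlib-only sketch; snowflake.core is just the module from the error message above:

```python
import importlib.util

def is_installed(name: str) -> bool:
    """Return True if the named module can be imported in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # find_spec raises this when a parent package of a dotted name is missing.
        return False

if not is_installed("snowflake.core"):
    print("snowflake.core is not installed; add it with the package selector first.")
```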

Missing package from existing notebook

New notebook versions are released continually, and notebooks are automatically upgraded to the latest version. Sometimes the packages in an older notebook’s environment aren’t compatible with the upgrade, which can cause the notebook to fail to start.

The following is an example of the error message when the libpython library is missing:

SnowflakeInternalException{signature=std::vector<sf::RuntimePathLinkage> sf::{anonymous}::buildRuntimeFileSet(const sf::UdfRuntime&, std::string_view, const std::vector<sf::udf::ThirdPartyLibrariesInfo>&, bool):"libpython_missing", internalMsg=[XP_WORKER_FAILURE: Unexpected error signaled by function 'std::vector<sf::RuntimePathLinkage> sf::{anonymous}::buildRuntimeFileSet(const sf::UdfRuntime&, std::string_view, const std::vector<sf::udf::ThirdPartyLibrariesInfo>&, bool)'
Assert "libpython_missing"[{"function": "std::vector<sf::RuntimePathLinkage> sf::{anonymous}::buildRuntimeFileSet(const sf::UdfRuntime&, std::string_view, const std::vector<sf::udf::ThirdPartyLibrariesInfo>&, bool)", "line": 1307, "stack frame ptr": "0xf2ff65553120",  "libPythonOnHost": "/opt/sfc/deployments/prod1/ExecPlatform/cache/directory_cache/server_2921757878/v3/python_udf_libs/.data/4e8f2a35e2a60eb4cce3538d6f794bd7881d238d64b1b3e28c72c0f3d58843f0/lib/libpython3.9.so.1.0"}]], userMsg=Processing aborted due to error 300010:791225565; incident 9770775., reporter=unknown, dumpFile= file://, isAborting=true, isVerbose=false}

To resolve this error, try the following steps:

  • Refresh the webpage and start the notebook again.

  • If the issue persists, open the package picker and check whether all installed packages are still valid. The drop-down for each package shows its available versions; selecting the latest version of the package usually clears the error.

Read-only file system issue

Some Python libraries download or cache data to a local user directory. However, the default user directory /home/udf is read-only. To work around this, set the path to /tmp, which is a writable location. Note that the environment variable used to set the write directory varies depending on which library you are using. The following known libraries present this issue:

  • matplotlib

  • HuggingFace

  • catboost
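The workaround follows the same pattern in each case: point the library’s cache or config location at a writable path before the first import. The following is a minimal sketch; the environment variable names are the ones each library documents, while catboost instead takes a train_dir argument on its model constructors:

```python
import os
import tempfile

# Must run *before* importing the affected library, because the cache
# location is usually read once at import time.
writable = tempfile.gettempdir()  # "/tmp" in the notebook environment

os.environ["MPLCONFIGDIR"] = writable                # matplotlib
os.environ["HF_HOME"] = writable                     # Hugging Face
os.environ["SENTENCE_TRANSFORMERS_HOME"] = writable  # sentence-transformers
```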

matplotlib example

The following warning appears when you try to use matplotlib:

Matplotlib created a temporary cache directory at /tmp/matplotlib-2fk8582w because the default path (/home/udf/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.

The following code sets the MPLCONFIGDIR variable to /tmp/ to resolve this warning:

import os
os.environ["MPLCONFIGDIR"] = '/tmp/'
import matplotlib.pyplot as plt

Hugging Face example

The following warning is returned when you try to use Hugging Face:

Readonly file system: `/home/udf/.cache`

The following code sets the HF_HOME and SENTENCE_TRANSFORMERS_HOME variables to /tmp/ to resolve this error:

import os
os.environ['HF_HOME'] = '/tmp'
os.environ['SENTENCE_TRANSFORMERS_HOME'] = '/tmp'

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-xs")

Output message is too large when using df.collect()

The following message is displayed in the cell output when you run df.collect():

MessageSizeError: Data of size 522.0 MB exceeds the message size limit of 200.0 MB.
This is often caused by a large chart or dataframe. Please decrease the amount of data sent to the browser,
or increase the limit by setting the config option server.maxMessageSize.
Click here to learn more about config options.
Note that increasing the limit may lead to long loading times and large memory consumption of the client's browser and the Streamlit server.

Snowflake Notebooks automatically truncates results in the cell output for large datasets in the following cases:

  • All SQL cell results.

  • Python cell results when the result is a snowpark.DataFrame.

The issue with the above cell is that df.collect() returns a Python list instead of a snowpark.DataFrame, and lists are not automatically truncated. To work around this, output the DataFrame directly:

df

Additionally, for large datasets, avoid using df.to_pandas(), because it loads all the data into memory and can cause an out-of-memory error.
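If you do need rows materialized in Python, bound the result before collecting it. df.limit and df.show are standard Snowpark DataFrame methods; the plain-Python helper below just illustrates the same capping idea:

```python
# With a Snowpark DataFrame, prefer a bounded preview over df.collect():
#   df.limit(100).collect()   # materialize at most 100 rows
#   df.show(10)               # print a small preview in the cell output
#
# The same idea in plain Python: cap any list before handing it to the UI.
def preview(rows, n=100):
    """Return at most n rows for display instead of the full result."""
    return rows[:n]

print(len(preview(list(range(100_000)))))
```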