Setting Up Your Development Environment for Snowpark Python¶
Set up your preferred local development environment to build client applications with Snowpark Python.
If you are writing a stored procedure with Snowpark Python, consider setting up a Python worksheet instead.
The Snowpark API requires Python 3.8.
You can create a Python 3.8 virtual environment using tools like Anaconda, Miniconda, or virtualenv.
For example, to use conda to create a Python 3.8 virtual environment, add the Snowflake conda channel, and install the numpy and pandas packages, type:
conda create --name py38_env --override-channels -c https://repo.anaconda.com/pkgs/snowflake python=3.8 numpy pandas
Creating a new conda environment locally with the Snowflake channel is recommended in order to have the best experience when using UDFs. For more information, see Local Development and Testing.
There is a known issue with running Snowpark Python on Apple M1 chips due to memory handling in pyOpenSSL. The error message displayed is, “Cannot allocate write+execute memory for ffi.callback()”.
As a workaround, set up a virtual environment that uses x86 Python using these commands:
CONDA_SUBDIR=osx-64 conda create -n snowpark python=3.8 numpy pandas --override-channels -c https://repo.anaconda.com/pkgs/snowflake
conda activate snowpark
conda config --env --set subdir osx-64
Then, install Snowpark within this environment as described in the next section.
Prerequisites for Using Pandas DataFrames¶
The Snowpark API provides methods for writing data to and from Pandas DataFrames. Pandas is a library for data analysis. With Pandas, you use a data structure called a DataFrame to analyze and manipulate two-dimensional data.
These methods require the following libraries:
Pandas 1.0.0 (or higher).
PyArrow library version 8.0.0.
If you do not already have PyArrow installed, you do not need to install it yourself; installing Snowpark automatically installs the appropriate version of PyArrow.
If you have already installed any version of the PyArrow library other than the recommended version listed above, uninstall PyArrow before installing Snowpark.
Do not re-install a different version of PyArrow after installing Snowpark.
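As a quick sanity check on these requirements, a small script can report the installed versions of both libraries. This is a minimal sketch using only the standard library; it reads package metadata and does not install or import anything:

```python
from importlib import metadata

# Report the installed versions of the libraries that Snowpark's
# pandas support depends on, without importing them.
versions = {}
for pkg in ("pandas", "pyarrow"):
    try:
        versions[pkg] = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        versions[pkg] = None

for pkg, ver in versions.items():
    print(pkg, ver if ver is not None else "is not installed")
```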
Before running the commands in this section, make sure you are in a Python 3.8 environment.
You can check this by running python -V. If the version displayed is not Python 3.8, refer to the previous section.
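The same check can be scripted. This minimal sketch inspects sys.version_info directly and reports whether the active interpreter matches the required version:

```python
import sys

# Snowpark requires Python 3.8; report whether the active interpreter matches.
major, minor = sys.version_info[:2]
if (major, minor) == (3, 8):
    print("OK: running Python 3.8")
else:
    print(f"Expected Python 3.8, found {major}.{minor}")
```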
Install the Snowpark Python package into the Python 3.8 virtual environment by using either conda or pip:
conda install snowflake-snowpark-python
pip install snowflake-snowpark-python
Optionally, specify additional packages to install in the environment, such as the Pandas data analysis package:
conda install snowflake-snowpark-python pandas
pip install "snowflake-snowpark-python[pandas]"
You can view the Snowpark Python project description on the Python Package Index (PyPI) repository.
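After installing, you can confirm that the package is visible from the active environment. This sketch looks the module up with importlib rather than importing it outright, so it degrades gracefully when the package is absent:

```python
import importlib.util

# Look up the Snowpark package in the active environment. find_spec raises
# ModuleNotFoundError if the parent "snowflake" package is absent, so catch
# that case and treat it the same as a missing spec.
try:
    spec = importlib.util.find_spec("snowflake.snowpark")
except ModuleNotFoundError:
    spec = None

if spec is None:
    print("snowflake-snowpark-python is not installed in this environment")
else:
    print("Snowpark is installed:", spec.name)
```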
Setting Up a Jupyter Notebook for Snowpark¶
To get started using Snowpark with Jupyter Notebooks, do the following:
Install Jupyter Notebooks:
pip install notebook
Start a Jupyter Notebook:
jupyter notebook
In the top-right corner of the web page that opens, select New » Python 3 Notebook.
In a cell, create a session. For more information, see Creating a Session.
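A session is created from a dictionary of connection parameters. The key names below are standard Snowpark connection keys, but every placeholder value is an assumption you must replace with your own account details; creating the session itself is shown only as a commented sketch because it requires a reachable Snowflake account:

```python
# Placeholder connection parameters -- replace each value with your own
# Snowflake account details before creating a session.
connection_parameters = {
    "account": "<your_account_identifier>",
    "user": "<your_user>",
    "password": "<your_password>",
    "role": "<your_role>",            # optional
    "warehouse": "<your_warehouse>",  # optional
    "database": "<your_database>",    # optional
    "schema": "<your_schema>",        # optional
}

# Creating the session requires the snowflake-snowpark-python package
# and a reachable Snowflake account:
#
#   from snowflake.snowpark import Session
#   session = Session.builder.configs(connection_parameters).create()
```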
Setting Up an IDE for Snowpark¶
You can use Snowpark with an integrated development environment (IDE).
To use Snowpark with Microsoft Visual Studio Code, install the Python extension and then specify the Python environment to use.
You must manually select the Python 3.8 environment that you created when you set up your development environment.
To do this, use the Python: Select Interpreter command from the Command Palette.
For more information, see Using Python environments in VS Code in the Visual Studio Code documentation.
The main classes for the Snowpark API are in the snowflake.snowpark module.
To import particular names from a module, specify the names. For example:
>>> from snowflake.snowpark.functions import avg
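A defensive variant of this import reports whether the package is available in the active environment; the trailing comment sketches a hypothetical use of avg, which requires an active session:

```python
# Verify the import works in the active environment; fall back with a
# message if the package is not installed.
try:
    from snowflake.snowpark.functions import avg
    imported = True
    print("avg imported successfully")
except ImportError:
    imported = False
    print("snowflake-snowpark-python is not installed in this environment")

# With an active session and a DataFrame df, avg can be used in a select,
# e.g. (sketch):
#   df.select(avg(df["amount"])).show()
```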