Developing Applications in Snowflake¶
To begin developing applications in Snowflake, you need to first become familiar with some Snowflake basics. This topic will guide you through the foundational tasks required for developing applications in Snowflake. In this topic, you will learn how to:
Install and configure the SnowSQL CLI client.
Create a virtual warehouse, database, and table.
Load data into the table and query the table.
Use Python to connect to Snowflake, perform queries, and manipulate the returned data.
You can then use this knowledge to explore the more advanced features and capabilities available in Snowflake.
In this Topic:
Download and Install the SnowSQL CLI Client¶
The first step is to download the SnowSQL CLI client from the Snowflake Client Repository. The instructions for downloading and installing SnowSQL are OS-specific:
You can use
curlto download the Linux version of the client from the Snowflake Client Repository:
curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-<version>-linux_x86_64.bash
After download, install the client via
bashin the terminal window:
Alternatively, you can download the RPM version of the client (
snowflake-snowsql-<version>-1.x86_64.rpm) and install it:
rpm -i snowflake-snowsql-<version>-1.x86_64.rpm
You can use
curlto download the macOS version of the client from the Snowflake Client Repository:
curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/darwin_x86_64/snowsql-<version>-darwin_x86_64.pkg
After download, install the client by double-clicking the
Alternately, if you have Homebrew, you can install the client through Homebrew Cask:
brew install --cask snowflake-snowsql
curlto download the Windows version of the client from the Snowflake Client Repository:
curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/windows_x86_64/snowsql-<version>-windows_x86_64.msi
Then, run the MSI file to install it.
After installation, verify that the client was successfully installed by running the
snowsql command in your terminal window:
The output displays all the available parameters you can use with the client.
Create a Virtual Warehouse, Sample Database, and Table Using SnowSQL¶
Log into Snowflake¶
Once SnowSQL is installed, you can log into Snowflake. To do so, you’ll need the following information:
<account_name>: This is the name assigned to your account by Snowflake. It’s the string of characters that precede
.snowflakecomputing.comin the URL you received from Snowflake.
<username>(and accompanying password): This is the Snowflake user that was created for you in your account.
To log in, open a terminal window and enter the following command:
snowsql -a <account_name> -u <username>
You are prompted for a password. If your login attempt is successful, a Snowflake session is initiated for your user and the SnowSQL login and prompt
are displayed. For example, if you installed SnowSQL 1.2.9 and your Snowflake username is
user1, the login and prompt would be:
* SnowSQL * v1.2.9 Type SQL statements or !help user1#(no warehouse)@(no database).(no schema)>
Note that the SnowSQL prompt indicates there is no virtual warehouse, database, or schema currently in use in your session.
The next sections describe how to create these required Snowflake objects and use them in the session.
Create a Virtual Warehouse¶
Virtual warehouses provide the compute resources required to perform queries and DML operations, such as data loading, on tables in Snowflake. Virtual warehouses come in different sizes to enable fitting each warehouse to match your querying and data loading workloads.
In this example, use the CREATE WAREHOUSE command to create an X-Small warehouse named
CREATE OR REPLACE WAREHOUSE sf_tuts_wh WITH WAREHOUSE_SIZE='X-SMALL' AUTO_SUSPEND = 180 AUTO_RESUME = TRUE INITIALLY_SUSPENDED=TRUE;
The only required parameter is the name of the warehouse; however, this example sets some additional, optional parameters:
AUTO_SUSPEND: Specifies the number of seconds of inactivity after which a warehouse is automatically suspended.
AUTO_RESUME: If set to TRUE, a suspended warehouse automatically resumes when a query is submitted to it.
INITIALLY_SUSPENDED: If set to TRUE, the warehouse is created in a suspended state.
These additional parameters ensure you don’t use excess credits because the warehouse is created suspended and does not use credits unless running. They also ensure that the warehouse is ready to resume as soon as queries are submitted so that your workflow is uninterrupted.
Object names in Snowflake, also referred to as identifiers, are case in-sensitive (and displayed in all-uppercase) unless explicitly specified otherwise. For more details, see Object Identifiers.
Once created, the virtual warehouse (
sf_tuts_wh) is automatically in use for the session, as shown in your updated SnowSQL prompt:
user1#SF_TUTS_WH@(no database).(no schema)>
To view the warehouse in use for a session, you can also use the CURRENT_WAREHOUSE function:
To explicitly specify the warehouse to use in the session, use the USE WAREHOUSE command:
USE WAREHOUSE sf_tuts_wh;
You’ve now successfully created a virtual warehouse. Let’s move on to databases and tables.
Create a Database¶
Use the CREATE DATABASE command to create databases. In this example, you create a database named
CREATE OR REPLACE DATABASE sf_tuts;
When you create a database in Snowflake, a default schema named
public is automatically created in the database. Both the database and schema
are automatically in use in the session, as shown in your updated SnowSQL prompt:
Create a Table¶
To begin working with data stored in the database, you must first create a table for the data. This can be done using the CREATE TABLE command.
The column names and data types in your table should match the data to be loaded. For the purposes of this topic, the data to be loaded is employee information provided in sample CSV files located here for download.
Download the ZIP file and then extract the CSV files to any location in your environment. In the subsequent examples in this topic, the location is
assumed to be a directory named
The following CREATE TABLE example illustrates the column names and data types that match data to be loaded. The table column names match the columns from the sample CSV files:
CREATE OR REPLACE TABLE emp_basic ( first_name STRING , last_name STRING , email STRING , streetaddress STRING , city STRING , start_date DATE );
Load and Query Data Using SnowSQL¶
This section describes how to use SQL commands and SnowSQL to bulk load data; however, Snowflake supports many other options for loading data.
See the sidebar (on the right) for links to other topics related to data loading.
Upload Data File(s)¶
Now load the table with data from one or more files. First, use the PUT command to upload the files to an internal stage.
The stage is where files are stored before loading. You can use wildcards (e.g.
*) to upload multiple files.
For example, to upload the example CSV files to the internal stage for the
emp_basic table you created earlier in the
- Linux or macOS
PUT file:///tmp/employees0*.csv @sf_tuts.public.%emp_basic;
PUT file://C:\temp\employees0*.csv @sf_tuts.public.%emp_basic;
@ sign before the database and schema name (
@sf_tuts.public) indicates that the files are being uploaded to an internal stage, rather
than an external stage. The
% sign before the table name (
%emp_basic) indicates that the internal stage being used is the stage for the
table. For more details about stages, see Staging Data Files from a Local File System.
Load Data from the File(s)¶
Once the file(s) are staged, you can use the COPY INTO <table> command to load the data from the files into the table you created:
COPY INTO emp_basic FROM @sf_tuts.public.%emp_basic FILE_FORMAT = (type = csv field_optionally_enclosed_by='"') PATTERN = '.*employees0[1-5].csv.gz' ON_ERROR = 'skip_file';
FILE_FORMAT specifies the file type (e.g.
csv), and how the fields are enclosed. In this example, the fields are enclosed by double quotation
If you are loading multiple files, the following optional parameters may be useful:
PATTERN: Applies a pattern to specify files that match a regular expression.
ON_ERROR: Specifies what happens when an error is encountered while loading a file.
Query the Table¶
With the data loaded into the table, you can now execute queries using SQL SELECT statements.
First, to return all the columns and rows in the table:
SELECT * FROM emp_basic;
A SELECT statement can include functions, such as LIKE, to retrieve data that meet certain qualifications. In this
example, the command returns the email addresses with a United Kingdom domain name (
SELECT email FROM emp_basic WHERE email LIKE '%.uk';
Update the Table¶
You can also use DML commands to directly manipulate the table. In this example, use the INSERT command to insert a single row into the table:
INSERT INTO emp_basic VALUES ('Clementine','Adamou','cadamou@sf_tuts.com','10510 Sachs Road','Klenak','2017-9-22');
Snowflake supports many other commands that you can use to interact with your data. For now, though, let’s move on and look at how to connect a Python application to Snowflake to harness the power of Snowflake’s supported programming languages.
Use Python with Snowflake¶
This section provides some basic examples for getting started using Python; however, Snowflake provides interfaces for many of the most popular programming languages.
See the sidebar (on the right) for links to other topics related to Snowflake’s supported programmatic interfaces.
Install the Snowflake Connector for Python¶
To connect to Snowflake using Python, Python 3.6 (or higher) and the Snowflake Connector for Python must be installed.
First, check to see the version of Python installed in your environment by running this command in a terminal window:
If your version is Python 3.6 or higher, you’re ready to proceed. If your version is lower or Python is not installed, you need to update or install it:
- Linux or macOS
Your environment distribution most likely has Python 3.5 or higher available to be installed via its package manager. You may also have to install
pip, the package management system used to install Python packages. It can also be downloaded from most distributions’ package managers.
You can download Python via the Python Software Foundation website.
With Python and
pip installed, use
pip to install the package for the Python connector via the command line:
pip install snowflake-connector-python
You can now connect to Snowflake using Python to query data and perform other operations.
Before you begin, you should consider adding your SnowSQL credentials as environment variables so they aren’t exposed in your scripts. For more details, see Connecting Through SnowSQL.
Import the Connector¶
In your Python application, first import the connector package (
snowflake.connector). If you’re using environment variables, you also need to
import snowflake.connector import os
Connect to Snowflake¶
Next, you connect to Snowflake with your credentials. If you’ve set up your credentials as environment variables, reference them here. This is also where you specify the virtual warehouse, database, and schema to use in your Snowflake session:
USER = os.getenv('SNOWSQL_USER') PASSWORD = os.getenv('SNOWSQL_PWD') ACCOUNT = os.getenv('SNOWSQL_ACCOUNT') ctx = snowflake.connector.connect( user=USER, password=PASSWORD, account=ACCOUNT, warehouse='SF_TUTS_WH', database='SF_TUTS', schema='PUBLIC' )
Define a Query¶
You can use commands to specify the data you need for your application. In this example, you concatenate multiple strings into a single query:
command = "SELECT start_date FROM emp_basic " command += "WHERE start_date >= '2016-12-31' " command += "ORDER by start_date;"
Execute the Query and Return the Results¶
The SQL commands are executed via the cursor object iterator method. The dates will then be appended into an empty Python list:
dates =  cs = ctx.cursor() try: cs.execute(command) for start_date in cs: dates.append(start_date) finally: cs.close()
Reading the data directly into a Python program lets you use Python packages to process the data. For example, the
matplotlib.pyplot visualization packages
can be used to visualize the data:
import matplotlib.pyplot import seaborn as sns months =  for i in dates: months.append(i.strftime('%m')) sns.distplot( months, hist = True, axlabel = "Month", label = "Employee Growth in 2017", bins = 12, kde=False) matplotlib.pyplot.show()
The resulting graph would look similar to the following:
For more details about what you can do with Python and Snowflake, see the Python documentation.
After completing the tasks in this topic, you should be familiar with the basics of using SnowSQL and the Snowflake Connector for Python.
Drop the Sample Database and Virtual Warehouse (Optional)¶
If you don’t plan on using the sample database and warehouse created in this topic, you should consider dropping them:
DROP DATABASE IF EXISTS sf_tuts; DROP WAREHOUSE IF EXISTS sf_tuts_wh;
Advanced Application Features¶
As you further explore Snowflake’s capabilities, consider the following features, which are available for developing advanced applications:
Semi-structured data (JSON, Avro, Parquet, etc.)
Continuous Data Pipelines
Functions and Procedures