Developing Applications in Snowflake

To begin developing applications in Snowflake, you need to first become familiar with some Snowflake basics. This topic will guide you through the foundational tasks required for developing applications in Snowflake. In this topic, you will learn how to:

  • Install and configure the SnowSQL CLI client.

  • Create a virtual warehouse, database, and table.

  • Load data into the table and query the table.

  • Use Python to connect to Snowflake, perform queries, and manipulate the returned data.

You can then use this knowledge to explore the more advanced features and capabilities available in Snowflake.

In this Topic:

Download and Install the SnowSQL CLI Client

The first step is to download the SnowSQL CLI client from the Snowflake Client Repository. The instructions for downloading and installing SnowSQL are OS-specific:

Linux

You can use curl to download the Linux version of the client from the Snowflake Client Repository:

curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/linux_x86_64/snowsql-<version>-linux_x86_64.bash

After download, install the client via bash in the terminal window:

bash snowsql-<version>-linux_x86_64.bash

Alternatively, you can download the RPM version of the client (snowflake-snowsql-<version>-1.x86_64.rpm) and install it:

rpm -i snowflake-snowsql-<version>-1.x86_64.rpm
macOS

You can use curl to download the macOS version of the client from the Snowflake Client Repository:

curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/darwin_x86_64/snowsql-<version>-darwin_x86_64.pkg

After download, install the client by double-clicking the snowsql-<version>-darwin_x86_64.pkg file.

Alternately, if you have Homebrew, you can install the client through Homebrew Cask:

brew cask install snowflake-snowsql
Windows

Use curl to download the Windows version of the client from the Snowflake Client Repository:

curl -O https://sfc-repo.snowflakecomputing.com/snowsql/bootstrap/1.2/windows_x86_64/snowsql-<version>-windows_x86_64.msi

Then, run the MSI file to install it.

After installation, verify that the client was successfully installed by running the snowsql command in your terminal window:

snowsql

The output displays all the available parameters you can use with the client.

Create a Virtual Warehouse, Sample Database, and Table Using SnowSQL

Log into Snowflake

Once SnowSQL is installed, you can log into Snowflake. To do so, you’ll need the following information:

  • <account_name>: This is the name assigned to your account by Snowflake. It’s the string of characters that precede .snowflakecomputing.com in the URL you received from Snowflake.

  • <username> (and accompanying password): This is the Snowflake user that was created for you in your account.

To log in, open a terminal window and enter the following command:

snowsql -a <account_name> -u <username>

You are prompted for a password. If your login attempt is successful, a Snowflake session is initiated for your user and the SnowSQL login and prompt are displayed. For example, if you installed SnowSQL 1.2.9 and your Snowflake username is user1, the login and prompt would be:

* SnowSQL * v1.2.9
Type SQL statements or !help
user1#(no warehouse)@(no database).(no schema)>

Note that the SnowSQL prompt indicates there is no virtual warehouse, database, or schema currently in use in your session.

The next sections describe how to create these required Snowflake objects and use them in the session.

Create a Virtual Warehouse

Virtual warehouses provide the compute resources required to perform queries and DML operations, such as data loading, on tables in Snowflake. Virtual warehouses come in different sizes to enable fitting each warehouse to match your querying and data loading workloads.

In this example, use the CREATE WAREHOUSE command to create an X-Small warehouse named sf_tuts_wh:

CREATE OR REPLACE WAREHOUSE sf_tuts_wh WITH
  WAREHOUSE_SIZE='X-SMALL'
  AUTO_SUSPEND = 180
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED=TRUE;

The only required parameter is the name of the warehouse; however, this example sets some additional, optional parameters:

  • AUTO_SUSPEND: Specifies the number of seconds of inactivity after which a warehouse is automatically suspended.

  • AUTO_RESUME: If set to TRUE, a suspended warehouse automatically resumes when a query is submitted to it.

  • INITIALLY_SUSPENDED: If set to TRUE, the warehouse is created in a suspended state.

These additional parameters ensure you don’t use excess credits because the warehouse is created suspended and does not use credits unless running. They also ensure that the warehouse is ready to resume as soon as queries are submitted so that your workflow is uninterrupted.

Tip

Object names in Snowflake, also referred to as identifiers, are case in-sensitive (and displayed in all-uppercase) unless explicitly specified otherwise. For more details, see Object Identifiers.

Once created, the virtual warehouse (sf_tuts_wh) is automatically in use for the session, as shown in your updated SnowSQL prompt:

user1#SF_TUTS_WH@(no database).(no schema)>

To view the warehouse in use for a session, you can also use the CURRENT_WAREHOUSE function:

SELECT CURRENT_WAREHOUSE();

To explicitly specify the warehouse to use in the session, use the USE WAREHOUSE command:

USE WAREHOUSE sf_tuts_wh;

You’ve now successfully created a virtual warehouse. Let’s move on to databases and tables.

Create a Database

Use the CREATE DATABASE command to create databases. In this example, you create a database named sf_tuts:

CREATE OR REPLACE DATABASE sf_tuts;

When you create a database in Snowflake, a default schema named public is automatically created in the database. Both the database and schema are automatically in use in the session, as shown in your updated SnowSQL prompt:

user1#SF_TUTS_WH@SF_TUTS.PUBLIC>

See also:

Create a Table

To begin working with data stored in the database, you must first create a table for the data. This can be done using the CREATE TABLE command.

The column names and data types in your table should match the data to be loaded. For the purposes of this topic, the data to be loaded is employee information provided in sample CSV files located here for download.

Download the ZIP file and then extract the CSV files to any location in your environment. In the subsequent examples in this topic, the location is assumed to be a directory named tmp.

The following CREATE TABLE example illustrates the column names and data types that match data to be loaded. The table column names match the columns from the sample CSV files:

CREATE OR REPLACE TABLE emp_basic (
  first_name STRING ,
  last_name STRING ,
  email STRING ,
  streetaddress STRING ,
  city STRING ,
  start_date DATE
  );

Load and Query Data Using SnowSQL

This section describes how to use SQL commands and SnowSQL to bulk load data; however, Snowflake supports many other options for loading data.

See the sidebar (on the right) for links to other topics related to data loading.

Upload Data File(s)

Now load the table with data from one or more files. First, use the PUT command to upload the files to an internal stage. The stage is where files are stored before loading. You can use wildcards (e.g. *) to upload multiple files.

For example, to upload the example CSV files to the internal stage for the emp_basic table you created earlier in the sf_tuts.public schema:

Linux or macOS
PUT file:///tmp/employees0*.csv @sf_tuts.public.%emp_basic;
Windows
PUT file://C:\temp\employees0*.csv @sf_tuts.public.%emp_basic;

Tip

The @ sign before the database and schema name (@sf_tuts.public) indicates that the files are being uploaded to an internal stage, rather than an external stage. The % sign before the table name (%emp_basic) indicates that the internal stage being used is the stage for the table. For more details about stages, see Staging Data Files from a Local File System.

Load Data from the File(s)

Once the file(s) are staged, you can use the COPY INTO <table> command to load the data from the files into the table you created:

COPY INTO emp_basic
  FROM @sf_tuts.public.%emp_basic
  FILE_FORMAT = (type = csv field_optionally_enclosed_by='"')
  PATTERN = '.*employees0[1-5].csv.gz'
  ON_ERROR = 'skip_file';

FILE_FORMAT specifies the file type (e.g. csv), and how the fields are enclosed. In this example, the fields are enclosed by double quotation marks.

Tip

If you are loading multiple files, the following optional parameters may be useful:

  • PATTERN: Applies a pattern to specify files that match a regular expression.

  • ON_ERROR: Specifies what happens when an error is encountered while loading a file.

Query the Table

With the data loaded into the table, you can now execute queries using SQL SELECT statements.

First, to return all the columns and rows in the table:

SELECT * FROM emp_basic;

A SELECT statement can include functions, such as LIKE, to retrieve data that meet certain qualifications. In this example, the command returns the email addresses with a United Kingdom domain name (.uk):

SELECT email FROM emp_basic WHERE email LIKE '%.uk';

Update the Table

You can also use DML commands to directly manipulate the table. In this example, use the INSERT command to insert a single row into the table:

INSERT INTO emp_basic VALUES
  ('Clementine','Adamou','cadamou@sf_tuts.com','10510 Sachs Road','Klenak','2017-9-22');

Snowflake supports many other commands that you can use to interact with your data. For now, though, let’s move on and look at how to connect a Python application to Snowflake to harness the power of Snowflake’s supported programming languages.

Use Python with Snowflake

This section provides some basic examples for getting started using Python; however, Snowflake provides interfaces for many of the most popular programming languages.

See the sidebar (on the right) for links to other topics related to Snowflake’s supported programmatic interfaces.

Install the Snowflake Connector for Python

To connect to Snowflake using Python, Python 3.6 (or higher) and the Snowflake Connector for Python must be installed.

First, check to see the version of Python installed in your environment by running this command in a terminal window:

python --version

If your version is Python 3.6 or higher, you’re ready to proceed. If your version is lower or Python is not installed, you need to update or install it:

Linux or macOS

Your environment distribution most likely has Python 3.5 or higher available to be installed via its package manager. You may also have to install pip, the package management system used to install Python packages. It can also be downloaded from most distributions’ package managers.

Windows

You can download Python via the Python Software Foundation website.

With Python and pip installed, use pip to install the package for the Python connector via the command line:

pip install snowflake-connector-python

You can now connect to Snowflake using Python to query data and perform other operations.

Tip

Before you begin, you should consider adding your SnowSQL credentials as environment variables so they aren’t exposed in your scripts. For more details, see Connecting Through SnowSQL.

Import the Connector

In your Python application, first import the connector package (snowflake.connector). If you’re using environment variables, you also need to import the os package:

import snowflake.connector
import os

Connect to Snowflake

Next, you connect to Snowflake with your credentials. If you’ve set up your credentials as environment variables, reference them here. This is also where you specify the virtual warehouse, database, and schema to use in your Snowflake session:

USER = os.getenv('SNOWSQL_USER')
PASSWORD = os.getenv('SNOWSQL_PWD')
ACCOUNT = os.getenv('SNOWSQL_ACCOUNT')

ctx = snowflake.connector.connect(
  user=USER,
  password=PASSWORD,
  account=ACCOUNT,
  warehouse='SF_TUTS_WH',
  database='SF_TUTS',
  schema='PUBLIC'
  )

Define a Query

You can use commands to specify the data you need for your application. In this example, you concatenate multiple strings into a single query:

command = "SELECT start_date FROM emp_basic "
command += "WHERE start_date >= '2016-12-31' "
command += "ORDER by start_date;"

Execute the Query and Return the Results

The SQL commands are executed via the cursor object iterator method. The dates will then be appended into an empty Python list:

dates = []
cs = ctx.cursor()
try:
    cs.execute(command)
    for start_date in cs:
        dates.append(start_date)
finally:
    cs.close()

Reading the data directly into a Python program lets you use Python packages to process the data. For example, the seaborn and matplotlib.pyplot visualization packages can be used to visualize the data:

import matplotlib.pyplot
import seaborn as sns

months = []

for i in dates:
    months.append(i.strftime('%m'))

sns.distplot(
    months,
    hist = True,
    axlabel = "Month",
    label = "Employee Growth in 2017",
    bins = 12,
    kde=False)

matplotlib.pyplot.show()

The resulting graph would look similar to the following:

Example of visualization package showing employee growth

For more details about what you can do with Python and Snowflake, see the Python documentation.

Next Steps

After completing the tasks in this topic, you should be familiar with the basics of using SnowSQL and the Snowflake Connector for Python.

Drop the Sample Database and Virtual Warehouse (Optional)

If you don’t plan on using the sample database and warehouse created in this topic, you should consider dropping them:

DROP DATABASE IF EXISTS sf_tuts;

DROP WAREHOUSE IF EXISTS sf_tuts_wh;

Advanced Application Features

As you further explore Snowflake’s capabilities, consider the following features, which are available for developing advanced applications:

Functional Area

Feature

Semi-structured data (JSON, Avro, Parquet, etc.)

Semi-structured Data

Data Lakes

Working with External Tables

Time Travel

Understanding & Using Time Travel

Continuous Data Pipelines

Change Tracking Using Table Streams . Executing SQL Statements on a Schedule Using Tasks

Variables

SQL Variables

Multi-statement Transactions

Transactions

Functions and Procedures

UDFs (User-Defined Functions) . External Functions . Stored Procedures