# Using the Snowflake SQLAlchemy Toolkit with the Python Connector¶

Snowflake SQLAlchemy runs on the top of the Snowflake Connector for Python as a dialect to bridge a Snowflake database and SQLAlchemy applications.

In this Topic:

## Prerequisites¶

### Snowflake Connector for Python¶

The only requirement for Snowflake SQLAlchemy is the Snowflake Connector for Python; however, the connector does not need to be installed because installing Snowflake SQLAlchemy automatically installs the connector.

### Data Analytics and Web Application Frameworks (Optional)¶

Snowflake SQLAlchemy can be used with Pandas, Jupyter and Pyramid, which provide higher levels of application frameworks for data analytics and web applications. However, building a working environment from scratch is not a trivial task, particularly for novice users. Installing the frameworks requires C compilers and tools, and choosing the right tools and versions is a hurdle that might deter users from using Python applications.

An easier way to build an environment is through Anaconda, which provides a complete, precompiled technology stack for all users, including non-Python experts such as data analysts and students. For Anaconda installation instructions, see the Anaconda install documentation. The Snowflake SQLAlchemy package can then be installed on top of Anaconda using pip.

## Installing Snowflake SQLAlchemy¶

The Snowflake SQLAlchemy package can be installed from the public PyPI repository using pip:

pip install --upgrade snowflake-sqlalchemy


pip automatically installs all required modules, including the Snowflake Connector for Python.

Note that the developer notes are hosted with the source code on GitHub.

## Verifying Your Installation¶

1. Create a file (e.g. validate.py) that contains the following Python sample code, which connects to Snowflake and displays the Snowflake version:

#!/usr/bin/env python
from sqlalchemy import create_engine

engine = create_engine(
account='<account_name>',
)
)
try:
connection = engine.connect()
results = connection.execute('select current_version()').fetchone()
print(results[0])
finally:
connection.close()
engine.dispose()

2. Replace <user_login_name>, <password>, and <account_name> with the appropriate values for your Snowflake account and user. For more details, see Connection Parameters (in this topic).

3. Execute the sample code. For example, if you created a file named validate.py:

python validate.py


The Snowflake version (e.g. 1.48.0) should be displayed.

## Snowflake-specific Parameters and Behavior¶

As much as possible, Snowflake SQLAlchemy provides compatible functionality for SQLAlchemy applications. For information on using SQLAlchemy, see the SQLAlchemy documentation.

However, Snowflake SQLAlchemy also provides Snowflake-specific parameters and behavior, which are described in the following sections.

### Connection Parameters¶

#### Required Parameters¶

Snowflake SQLAlchemy uses the following connection string syntax to connect to Snowflake and initiate a session:

'snowflake://<user_login_name>:<password>@<account_name>'


Where:

• <user_login_name> is the login name for your Snowflake user.

• <password> is the password for your Snowflake user.

• <account_name> is the full name of your account (provided by Snowflake).

Note that your full account name may include additional segments that identify the region and cloud platform where your account is hosted.

Account name examples by region

If your account locator is xy12345:

Cloud Platform / . Region

Account Locator . (with additional segments if required)

Amazon Web Services (AWS)

US West (Oregon)

xy12345

US East (Ohio)

xy12345.us-east-2.aws

US East (N. Virginia)

xy12345.us-east-1

US East (Commercial Gov - N. Virginia)

xy12345.us-east-1-gov.aws

xy12345.ca-central-1.aws

EU (Ireland)

xy12345.eu-west-1

Europe (London)

xy12345.eu-west-2.aws

EU (Frankfurt)

xy12345.eu-central-1

Asia Pacific (Tokyo)

xy12345.ap-northeast-1.aws

Asia Pacific (Mumbai)

xy12345.ap-south-1.aws

Asia Pacific (Singapore)

xy12345.ap-southeast-1

Asia Pacific (Sydney)

xy12345.ap-southeast-2

Google Cloud Platform (GCP)

US Central1 (Iowa)

xy12345.us-central1.gcp

Europe West2 (London)

xy12345.europe-west2.gcp

Europe West4 (Netherlands)

xy12345.europe-west4.gcp

Microsoft Azure

West US 2 (Washington)

xy12345.west-us-2.azure

East US 2 (Virginia)

xy12345.east-us-2.azure

US Gov Virginia

xy12345.us-gov-virginia.azure

xy12345.canada-central.azure

West Europe (Netherlands)

xy12345.west-europe.azure

Switzerland North (Zurich)

xy12345.switzerland-north.azure

Southeast Asia (Singapore)

xy12345.southeast-asia.azure

Australia East (New South Wales)

xy12345.australia-east.azure

Important

If either of the following conditions is true, your account locator is different than the structure described in the above examples:

• If AWS PrivateLink or Azure Private Link is enabled for your account, your account locator requires a privatelink segment in place of the region segment.

For more details, see:

Note

Do not include the Snowflake domain name (snowflakecomputing.com) as part of your account name. Snowflake automatically appends the domain name to your account name to create the required connection.

#### Additional Connection Parameters¶

You can optionally include the following additional information at the end of the connection string (after <account_name>):

'snowflake://<user_login_name>:<password>@<account_name>/<database_name>/<schema_name>?warehouse=<warehouse_name>&role=<role_name>'


Where:

• <database_name> and <schema_name> are the initial database and schema for the Snowflake session, separated by forward slashes (/).

• warehouse=<warehouse_name> and role=<role_name>' are the initial warehouse and role for the session, specified as parameter strings, separated by question marks (?).

Note

After login, the initial database, schema, warehouse, and role specified in the connection string can always be changed for the session.

#### Proxy Server Configuration¶

Proxy server parameters are not supported. Instead, use the supported environment variables to configure a proxy server. For information, see Using a Proxy Server.

#### Connection String Examples¶

The following example calls the create_engine method with the user name testuser1, password 0123456, account name xy12345.us-east-1, database testdb, schema public, warehouse testwh, and role myrole:

from sqlalchemy import create_engine
engine = create_engine(
'snowflake://testuser1:0123456@xy12345.us-east-1/testdb/public?warehouse=testwh&role=myrole'
)


For convenience, you can use the snowflake.sqlalchemy.URL method to construct the connection string and connect to the database. The following example constructs the same connection string from the previous example:

from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine

engine = create_engine(URL(
account = 'xy12345.us-east-1',
user = 'testuser1',
database = 'testdb',
schema = 'public',
warehouse = 'testwh',
role='myrole',
))


### Opening and Closing a Connection¶

Open a connection by executing engine.connect(); avoid using engine.execute().

# Avoid this.
engine = create_engine(...)
engine.execute(<SQL>)
engine.dispose()

# Do this.
engine = create_engine(...)
connection = engine.connect()
try:
connection.execute(<SQL>)
finally:
connection.close()
engine.dispose()


Note

Make certain to close the connection by executing connection.close() before engine.dispose(); otherwise, the Python Garbage collector removes the resources required to communicate with Snowflake, preventing the Python connector from closing the session properly.

### Auto-increment Behavior¶

Auto-incrementing a value requires the Sequence object. Include the Sequence object in the primary key column to automatically increment the value as each new record is inserted. For example:

t = Table('mytable', metadata,
Column('id', Integer, Sequence('id_seq'), primary_key=True),
Column(...), ...
)


### Object Name Case Handling¶

Snowflake stores all case-insensitive object names in uppercase text. In contrast, SQLAlchemy considers all lowercase object names to be case-insensitive. Snowflake SQLAlchemy converts the object name case during schema-level communication (i.e. during table and index reflection). If you use uppercase object names, SQLAlchemy assumes they are case-sensitive and encloses the names with quotes. This behavior will cause mismatches against data dictionary data received from Snowflake, so unless identifier names have been truly created as case sensitive using quotes (e.g. "TestDb"), all lowercase names should be used on the SQLAlchemy side.

### Index Support¶

Snowflake does not utilize indexes, so neither does Snowflake SQLAlchemy.

### Numpy Data Type Support¶

Snowflake SQLAlchemy supports binding and fetching NumPy data types. Binding is always supported. To enable fetching NumPy data types, add numpy=True to the connection parameters.

The following NumPy data types are supported:

• numpy.int64

• numpy.float64

• numpy.datetime64

The following example shows the round trip of numpy.datetime64 data:

import numpy as np
import pandas as pd
engine = create_engine(URL(
account = 'xy12345',
user = 'testuser1',
database = 'db',
schema = 'public',
warehouse = 'testwh',
role='myrole',
numpy=True,
))

specific_date = np.datetime64('2016-03-04T12:03:05.123456789Z')

connection = engine.connect()
connection.execute(
"CREATE OR REPLACE TABLE ts_tbl(c1 TIMESTAMP_NTZ)")
connection.execute(
"INSERT INTO ts_tbl(c1) values(%s)", (specific_date,)
)
df = pd.read_sql_query("SELECT * FROM ts_tbl", engine)
assert df.c1.values[0] == specific_date


### Cache Column Metadata¶

SQLAlchemy provides the runtime inspection API to get the runtime information about the various objects. One of the common use case is get all tables and their column metadata in a schema in order to construct a schema catalog. For example, alembic on top of SQLAlchemy manages database schema migrations. A pseudo code flow is as follows:

inspector = inspect(engine)
schema = inspector.default_schema_name
for table_name in inspector.get_table_names(schema):
column_metadata = inspector.get_columns(table_name, schema)
primary_keys = inspector.get_primary_keys(table_name, schema)
foreign_keys = inspector.get_foreign_keys(table_name, schema)
...


In this flow, a potential problem is it may take quite a while as queries run on each table. The results are cached but getting column metadata is expensive.

To mitigate the problem, Snowflake SQLAlchemy takes a flag cache_column_metadata=True such that all of column metadata for all tables are cached when get_table_names is called and the rest of get_columns, get_primary_keys and get_foreign_keys can take advantage of the cache.

engine = create_engine(URL(
account = 'xy12345',
user = 'testuser1',
database = 'db',
schema = 'public',
warehouse = 'testwh',
role='myrole',
))


Note

Memory usage will go up higher as all of column metadata are cached associated with Inspector object. Use the flag only if you need to get all of column metadata.

### VARIANT, ARRAY, and OBJECT Support¶

Snowflake SQLAlchemy supports fetching VARIANT, ARRAY and OBJECT data types. All types are converted into str in Python so that you can convert them to native data types using json.loads.

This example shows how to create a table including VARIANT, ARRAY, and OBJECT data type columns:

from snowflake.sqlalchemy import (VARIANT, ARRAY, OBJECT)
...
t = Table('my_semi_structured_datatype_table', metadata,
Column('va', VARIANT),
Column('ob', OBJECT),
Column('ar', ARRAY))
metdata.create_all(engine)


In order to retrieve VARIANT, ARRAY, and OBJECT data type columns and convert them to the native Python data types, fetch data and call the json.loads method as follows:

import json
connection = engine.connect()
results = connection.execute(select([t]))
row = results.fetchone()


### CLUSTER BY Support¶

Snowflake SQLAchemy supports the CLUSTER BY parameter for tables. For information about the parameter, see CREATE TABLE.

This example shows how to create a table with two columns, id and name, as the clustering key:

t = Table('myuser', metadata,
Column('id', Integer, primary_key=True),
Column('name', String),
snowflake_clusterby=['id', 'name'], ...
)

Alembic is a database migration tool on top of SQLAlchemy. Snowflake SQLAlchemy works by adding the following code to alembic/env.py so that Alembic can recognize Snowflake SQLAlchemy.
from alembic.ddl.impl import DefaultImpl