Python Connector API

The Snowflake Connector for Python implements the Python Database API v2.0 specification (PEP-249). This topic covers the standard API and the Snowflake-specific extensions.

Module: snowflake.connector

The main module is snowflake.connector, which creates a Connection object and provides Error classes.

Constants

apilevel

String constant stating the supported API level. The connector supports API "2.0".

threadsafety

Integer constant stating the level of thread safety the interface supports. The Snowflake Connector for Python supports level 2, which states that threads can share the module and connections.

paramstyle

String constant stating the type of parameter marker formatting expected by the interface. The connector supports the "pyformat" type by default, which applies to Python extended format codes (e.g. ...WHERE name=%s or ...WHERE name=%(name)s). Connection.connect can override paramstyle to change the bind variable formats to "qmark" or "numeric", where the variables are ? or :N, respectively.

For example:

format: .execute("... WHERE my_column = %s", (value,))
pyformat: .execute("... WHERE my_column = %(name)s", {"name": value})
qmark: .execute("... WHERE my_column = ?", (value,))
numeric: .execute("... WHERE my_column = :1", (value,))
Copy

Note

The binding variable occurs on the client side if paramstyle is "pyformat" or "format", and on the server side if "qmark" or "numeric". Currently, there is no significant difference between those options in terms of performance or features because the connector doesn’t support compiling SQL text followed by multiple executions. Instead, the "qmark" and "numeric" options align with the query text compatibility of other drivers (i.e. JDBC, ODBC, Go Snowflake Driver), which support server side bindings with the variable format ? or :N.

Functions

connect(parameters...)
Purpose:

Constructor for creating a connection to the database. Returns a Connection object.

By default, autocommit mode is enabled (i.e. if the connection is closed, all changes are committed). If you need a transaction, use the BEGIN command to start the transaction, and COMMIT or ROLLBACK to commit or roll back any changes.

Parameters:

The valid input parameters are:

Parameter

Required

Description

account

Yes

Your account identifier. The account identifier does not include the snowflakecomputing.com suffix. . . For details and examples, see Usage Notes (in this topic).

user

Yes

Login name for the user.

password

Yes

Password for the user.

region

Deprecated This description of the parameter is for backwards compatibility only.

host

No longer used Host name. Used internally only (i.e. does not need to be set).

port

No longer used Port number (443 by default). Used internally only (i.e. does not need to be set).

database

Name of the default database to use. After login, you can use USE DATABASE to change the database.

schema

Name of the default schema to use for the database. After login, you can use USE SCHEMA to change the schema.

role

Name of the default role to use. After login, you can use USE ROLE to change the role.

warehouse

Name of the default warehouse to use. After login, you can use USE WAREHOUSE to change the warehouse.

passcode_in_password

False by default. Set this to True if the MFA (Multi-Factor Authentication) passcode is embedded in the login password.

passcode

The passcode provided by Duo when using MFA (Multi-Factor Authentication) for login.

private_key

The private key used for authentication. For more information, see Using key-pair authentication and key-pair rotation.

private_key_file

Specifies the path to the private key file for the specified user. See Using key-pair authentication and key-pair rotation.

private_key_file_pwd

Specifies the passphrase to decrypt the private key file for the specified user. See Using key-pair authentication and key-pair rotation.

autocommit

None by default, which honors the Snowflake parameter AUTOCOMMIT. Set to True or False to enable or disable autocommit mode in the session, respectively.

client_prefetch_threads

Number of threads used to download the results sets (4 by default). Increasing the value improves fetch performance but requires more memory.

client_session_keep_alive

To keep the session active indefinitely (even if there is no activity from the user), set this to True. When setting this to True, call the close method to terminate the thread properly; otherwise, the process might hang.

The default value depends on the version of the connector that you are using:

  • Version 2.4.6 and later: None by default. . When the value is None, the CLIENT_SESSION_KEEP_ALIVE session parameter takes precedence. . . To override the session parameter, pass in True or False for this argument.

  • Version 2.4.5 and earlier: False by default. . When the value is False (either by specifying the value explicitly or by omitting the argument), the CLIENT_SESSION_KEEP_ALIVE session parameter takes precedence. . . Passing client_session_keey_alive=False to the connect method does not override the value TRUE in the CLIENT_SESSION_KEEP_ALIVE session parameter.

login_timeout

Timeout in seconds for login. By default, 60 seconds. The login request gives up after the timeout length if the HTTP response is “success”.

network_timeout

Timeout in seconds for all other operations. By default, none/infinite. A general request gives up after the timeout length if the HTTP response is not “success”.

ocsp_response_cache_filename

URI for the OCSP response cache file. By default, the OCSP response cache file is created in the cache directory:

  • Linux: ~/.cache/snowflake/ocsp_response_cache

  • macOS: ~/Library/Caches/Snowflake/ocsp_response_cache

  • Windows: %USERPROFILE%AppDataLocalSnowflakeCachesocsp_response_cache

To locate the file in a different directory, specify the path and file name in the URI (e.g. file:///tmp/my_ocsp_response_cache.txt).

authenticator

Authenticator for Snowflake:

  • snowflake (default) to use the internal Snowflake authenticator.

  • externalbrowser to authenticate using your web browser and Okta, AD FS, or any other SAML 2.0-compliant identity provider (IdP) that has been defined for your account.

  • https://<okta_account_name>.okta.com (i.e. the URL endpoint for your Okta account) to authenticate through native Okta.

  • oauth to authenticate using OAuth. You must also specify the token parameter and set its value to the OAuth access token.

If the value is not snowflake, the user and password parameters must be your login credentials for the IdP.

validate_default_parameters

False by default. If True, then:

  • Raise an exception if the specified database, schema, or warehouse doesn’t exist.

  • Print a warning to stderr if an invalid argument name or an argument value of the wrong data type is passed.

paramstyle

pyformat by default for client side binding. Specify qmark or numeric to change bind variable formats for server side binding.

timezone

None by default, which honors the Snowflake parameter TIMEZONE. Set to a valid time zone (e.g. America/Los_Angeles) to set the session time zone.

arrow_number_to_decimal

False by default, which means that NUMBER column values are returned as double-precision floating point numbers (float64). . . Set this to True to return DECIMAL column values as decimal numbers (decimal.Decimal) when calling the fetch_pandas_all() and fetch_pandas_batches() methods. . . This parameter was introduced in version 2.4.3 of the Snowflake Connector for Python.

socket_timeout

Timeout in seconds for socket-level read and connect requests. For more information, see Managing connection timeouts.

backoff_policy

Name of the generator function that defines how long to wait between retries. For more information, see Managing connection backoff policies for retries.

Attributes

Error, Warning, ...

All exception classes defined by the Python database API standard. The Snowflake Connector for Python provides the attributes msg, errno, sqlstate, sfqid and raw_msg.

Usage notes for the account parameter (for the connect method)

For the required account parameter, specify your account identifier.

Note that the account identifier does not include the snowflakecomputing.com domain name. Snowflake automatically appends this when creating the connection.

The following example uses an account identifier that specifies the myaccount in the organization myorganization:

ctx = snowflake.connector.connect(
    user='<user_name>',
    password='<password>',
    account='myorganization-myaccount',
    ... )
Copy

The following example uses the account locator xy12345 as the account identifier:

ctx = snowflake.connector.connect(
    user='<user_name>',
    password='<password>',
    account='xy12345',
    ... )
Copy

Note that this example uses an account in the AWS US West (Oregon) region. If the account is in a different region or if the account uses a different cloud provider, you need to specify additional segments after the account locator.

Object: Connection

A Connection object holds the connection and session information to keep the database connection active. If it is closed or the session expires, any subsequent operations will fail.

Methods

autocommit(True|False)
Purpose:

Enables or disables autocommit mode. By default, autocommit is enabled (True).

close()
Purpose:

Closes the connection. If a transaction is still open when the connection is closed, the changes are rolled back.

Closing the connection explicitly removes the active session from the server; otherwise, the active session continues until it is eventually purged from the server, limiting the number of concurrent queries.

For example:

# context manager ensures the connection is closed
with snowflake.connector.connect(...) as con:
    con.cursor().execute(...)

# try & finally to ensure the connection is closed.
con = snowflake.connector.connect(...)
try:
    con.cursor().execute(...)
finally:
    con.close()
Copy
commit()
Purpose:

If autocommit is disabled, commits the current transaction. If autocommit is enabled, this method is ignored.

rollback()
Purpose:

If autocommit is disabled, rolls back the current transaction. If autocommit is enabled, this method is ignored.

cursor()
Purpose:

Constructor for creating a Cursor object. The return values from fetch*() calls will be a single sequence or list of sequences.

cursor(snowflake.connector.DictCursor)
Purpose:

Constructor for creating a DictCursor object. The return values from fetch*() calls will be a single dict or list of dict objects. This is useful for fetching values by column name from the results.

execute_string(sql_text, remove_comments=False, return_cursors=True)
Purpose:

Execute one or more SQL statements passed as strings. If remove_comments is set to True, comments are removed from the query. If return_cursors is set to True, this method returns a sequence of Cursor objects in the order of execution.

Example:

This example shows executing multiple commands in a single string and then using the sequence of cursors that is returned:

cursor_list = connection1.execute_string(
    "SELECT * FROM testtable WHERE col1 LIKE 'T%';"
    "SELECT * FROM testtable WHERE col2 LIKE 'A%';"
    )

for cursor in cursor_list:
   for row in cursor:
      print(row[0], row[1])
Copy

Note

Methods such as execute_string() that allow multiple SQL statements in a single string are vulnerable to SQL injection attacks. Avoid using string concatenation, or functions such as Python’s format() function, to dynamically compose a SQL statement by combining SQL with data from users unless you have validated the user data. The example below demonstrates the problem:

# "Binding" data via the format() function (UNSAFE EXAMPLE)
value1_from_user = "'ok3'); DELETE FROM testtable WHERE col1 = 'ok1'; select pi("
sql_cmd = "insert into testtable(col1) values('ok1'); "                  \
          "insert into testtable(col1) values('ok2'); "                  \
          "insert into testtable(col1) values({col1});".format(col1=value1_from_user)
# Show what SQL Injection can do to a composed statement.
print(sql_cmd)

connection1.execute_string(sql_cmd)
Copy

The dynamically-composed statement looks like the following (newlines have been added for readability):

insert into testtable(col1) values('ok1');
insert into testtable(col1) values('ok2');
insert into testtable(col1) values('ok3');
DELETE FROM testtable WHERE col1 = 'ok1';
select pi();
Copy

If you are combining SQL statements with strings entered by untrusted users, then it is safer to bind data to a statement than to compose a string. The execute_string() method doesn’t take binding parameters, so to bind parameters use Cursor.execute() or Cursor.executemany().

execute_stream(sql_stream, remove_comments=False)
Purpose:

Execute one or more SQL statements passed as a stream object. If remove_comments is set to True, comments are removed from the query. This generator yields each Cursor object as SQL statements run.

get_query_status(query_id)
Purpose:

Returns the status of a query.

Parameters:

query_id

The ID of the query. See Retrieving the Snowflake query ID.

Returns:

Returns the QueryStatus object that represents the status of the query.

Example:

See Checking the status of a query.

get_query_status_throw_if_error(query_id)
Purpose:

Returns the status of a query. If the query results in an error, this method raises a ProgrammingError (as the execute() method would).

Parameters:

query_id

The ID of the query. See Retrieving the Snowflake query ID.

Returns:

Returns the QueryStatus object that represents the status of the query.

Example:

See Checking the status of a query.

is_still_running(query_status)
Purpose:

Returns True if the query status indicates that the query has not yet completed or is still in process.

Parameters:

query_status

The QueryStatus object that represents the status of the query. To get this object for a query, see Checking the status of a query.

Example:

See Checking the status of a query.

is_an_error(query_status)
Purpose:

Returns True if the query status indicates that the query resulted in an error.

Parameters:

query_status

The QueryStatus object that represents the status of the query. To get this object for a query, see Checking the status of a query.

Example:

See Checking the status of a query.

Attributes

expired

Tracks whether the connection’s master token has expired.

messages

The list object including sequences (exception class, exception value) for all messages received from the underlying database for this connection.

The list is cleared automatically by any method call.

errorhandler

Read/Write attribute that references an error handler to call in case an error condition is met.

The handler must be a Python callable that accepts the following arguments:

errorhandler(connection, cursor, errorclass, errorvalue)

Error, Warning, ...

All exception classes defined by the Python database API standard.

Object: Cursor

A Cursor object represents a database cursor for execute and fetch operations. Each cursor has its own attributes, description and rowcount, such that cursors are isolated.

Methods

close()
Purpose:

Closes the cursor object.

describe(command [, parameters][, timeout][, file_stream])
Purpose:

Returns metadata about the result set without executing a database command. This returns the same metadata that is available in the description attribute after executing a query.

This method was introduced in version 2.4.6 of the Snowflake Connector for Python.

Parameters:

See the parameters for the execute() method.

Returns:

Returns a list of ResultMetadata objects that describe the columns in the result set.

Example:

See Retrieving column metadata.

execute(command [, parameters][, timeout][, file_stream])
Purpose:

Prepares and executes a database command.

Parameters:

command

A string containing the SQL statement to execute.

parameters

(Optional) If you used parameters for binding data in the SQL statement, set this to the list or dictionary of variables that should be bound to those parameters.

For more information about mapping the Python data types for the variables to the SQL data types of the corresponding columns, see Data type mappings for qmark and numeric bindings.

timeout

(Optional) Number of seconds to wait for the query to complete. If the query has not completed after this time has passed, the query should be aborted.

file_stream

(Optional) When executing a PUT command, you can use this parameter to upload an in-memory file-like object (e.g. the I/O object returned from the Python open() function), rather than a file on the filesystem. Set this parameter to that I/O object.

When specifying the URI for the data file in the PUT command:

  • You can use any directory path. The directory path that you specify in the URI is ignored.

  • For the filename, specify the name of the file that should be created on the stage.

For example, to upload a file from a file stream to a file named:

@mystage/myfile.csv
Copy

use the following call:

cursor.execute(
    "PUT file://this_directory_path/is_ignored/myfile.csv @mystage",
    file_stream=<io_object>)
Copy
Returns:

Returns the reference of a Cursor object.

executemany(command, seq_of_parameters)
Purpose:

Prepares a database command and executes it against all parameter sequences found in seq_of_parameters. You can use this method to perform a batch insert operation.

Parameters:

command

The command is a string containing the code to execute. The string should contain one or more placeholders (such as question marks) for Binding data.

For example:

"insert into testy (v1, v2) values (?, ?)"
Copy

seq_of_parameters

This should be a sequence (list or tuple) of lists or tuples. See the example code below for example sequences.

Returns:

Returns the reference of a Cursor object.

Example:
# This example uses qmark (question mark) binding, so
# you must configure the connector to use this binding style.
from snowflake import connector
connector.paramstyle='qmark'

stmt1 = "create table testy (V1 varchar, V2 varchar)"
cs.execute(stmt1)

# A list of lists
sequence_of_parameters1 = [ ['Smith', 'Ann'], ['Jones', 'Ed'] ]
# A tuple of tuples
sequence_of_parameters2 = ( ('Cho', 'Kim'), ('Cooper', 'Pat') )

stmt2 = "insert into testy (v1, v2) values (?, ?)"
cs.executemany(stmt2, sequence_of_parameters1)
cs.executemany(stmt2, sequence_of_parameters2)
Copy

Internally, multiple execute methods are called and the result set from the last execute call will remain.

Note

The executemany method can only be used to execute a single parameterized SQL statement and pass multiple bind values to it.

Executing multiple SQL statements separated by a semicolon in one execute call is not supported. Instead, issue a separate execute call for each statement.

execute_async(...)
Purpose:

Prepares and submits a database command for asynchronous execution. See Performing an asynchronous query.

Parameters:

This method uses the same parameters as the execute() method.

Returns:

Returns the reference of a Cursor object.

Example:

See Examples of asynchronous queries.

fetch_arrow_all()
Purpose:

This method fetches all the rows in a cursor and loads them into a PyArrow table.

Parameters:

None.

Returns:

Returns a PyArrow table containing all the rows from the result set.

If there are no rows, this returns None.

Example:

See Distributing workloads that fetch results with the Snowflake Connector for Python.

fetch_arrow_batches()
Purpose:

This method fetches a subset of the rows in a cursor and delivers them to a PyArrow table.

Parameters:

None.

Returns:

Returns a PyArrow table containing a subset of the rows from the result set.

Returns None if there are no more rows to fetch.

Example:

See Distributing workloads that fetch results with the Snowflake Connector for Python.

get_result_batches()
Purpose:

Returns a list of ResultBatch objects that you can use to fetch a subset of rows from the result set.

Parameters:

None.

Returns:

Returns a list of ResultBatch objects or None if the query has not finished executing.

Example:

See Distributing workloads that fetch results with the Snowflake Connector for Python.

get_results_from_sfqid(query_id)
Purpose:

Retrieves the results of an asynchronous query or a previously submitted synchronous query.

Parameters:

query_id

The ID of the query. See Retrieving the Snowflake query ID.

Example:

See Using the query ID to retrieve the results of a query.

fetchone()
Purpose:

Fetches the next row of a query result set and returns a single sequence/dict or None when no more data is available.

fetchmany([size=cursor.arraysize])
Purpose:

Fetches the next rows of a query result set and returns a list of sequences/dict. An empty sequence is returned when no more rows are available.

fetchall()
Purpose:

Fetches all or remaining rows of a query result set and returns a list of sequences/dict.

fetch_pandas_all()
Purpose:

This method fetches all the rows in a cursor and loads them into a pandas DataFrame.

Parameters:

None.

Returns:

Returns a DataFrame containing all the rows from the result set. For more information about pandas data frames, see the pandas DataFrame documentation.

If there are no rows, this returns None.

Usage Notes:
  • This method is not a complete replacement for the read_sql() method of pandas; this method is to provide a fast way to retrieve data from a SELECT query and store the data in a pandas DataFrame.

  • Currently, this method works only for SELECT statements.

Examples:
ctx = snowflake.connector.connect(
          host=host,
          user=user,
          password=password,
          account=account,
          warehouse=warehouse,
          database=database,
          schema=schema,
          protocol='https',
          port=port)

# Create a cursor object.
cur = ctx.cursor()

# Execute a statement that will generate a result set.
sql = "select * from t"
cur.execute(sql)

# Fetch the result set from the cursor and deliver it as the pandas DataFrame.
df = cur.fetch_pandas_all()

# ...
Copy
fetch_pandas_batches()
Purpose:

This method fetches a subset of the rows in a cursor and delivers them to a pandas DataFrame.

Parameters:

None.

Returns:

Returns a DataFrame containing a subset of the rows from the result set. For more information about pandas data frames, see the pandas DataFrame documentation.

Returns None if there are no more rows to fetch.

Usage Notes:
  • Depending upon the number of rows in the result set, as well as the number of rows specified in the method call, the method might need to be called more than once, or it might return all rows in a single batch if they all fit.

  • This method is not a complete replacement for the read_sql() method of pandas; this method is to provide a fast way to retrieve data from a SELECT query and store the data in a pandas DataFrame.

  • Currently, this method works only for SELECT statements.

Examples:
ctx = snowflake.connector.connect(
          host=host,
          user=user,
          password=password,
          account=account,
          warehouse=warehouse,
          database=database,
          schema=schema,
          protocol='https',
          port=port)

# Create a cursor object.
cur = ctx.cursor()

# Execute a statement that will generate a result set.
sql = "select * from t"
cur.execute(sql)

# Fetch the result set from the cursor and deliver it as the pandas DataFrame.
for df in cur.fetch_pandas_batches():
    my_dataframe_processing_function(df)

# ...
Copy
__iter__()

Returns self to make cursors compatible with the iteration protocol.

Attributes

description

Read-only attribute that returns metadata about the columns in the result set.

This attribute is set after you call the execute() method to execute the query. (In version 2.4.6 or later, you can retrieve this metadata without executing the query by calling the describe() method.)

This attribute is set to one of the following:

  • Versions 2.4.5 and earlier: This attribute is set to a list of tuples.

  • Versions 2.4.6 and later: This attribute is set to a list of ResultMetadata objects.

Each tuple or ResultMetadata object contains the metadata that describes a column in the result set. You can access the metadata by index or, in versions 2.4.6 and later, by ResultMetadata object attribute:

Index of Value

ResultMetadata Attribute

Description

0

name

Column name.

1

type_code

Internal type code.

2

display_size

(Not used. Same as internal_size.)

3

internal_size

Internal data size.

4

precision

Precision of numeric data.

5

scale

Scale for numeric data.

6

is_nullable

True if NULL values allowed for the column or False.

For examples of getting this attribute, see Retrieving column metadata.

rowcount

Read-only attribute that returns the number of rows in the last execute produced. The value is -1 or None if no execute is executed.

sfqid

Read-only attribute that returns the Snowflake query ID in the last execute or execute_async executed.

arraysize

Read/write attribute that specifies the number of rows to fetch at a time with fetchmany(). It defaults to 1 meaning to fetch a single row at a time.

connection

Read-only attribute that returns a reference to the Connection object on which the cursor was created.

messages

List object that includes the sequences (exception class, exception value) for all messages which it received from the underlying database for the cursor.

The list is cleared automatically by any method call except for fetch*() calls.

errorhandler

Read/write attribute that references an error handler to call in case an error condition is met.

The handler must be a Python callable that accepts the following arguments:

errorhandler(connection, cursor, errorclass, errorvalue)

Type codes

In the Cursor object, the description attribute and the describe() method provide a list of tuples (or, in versions 2.4.6 and later, ResultMetadata objects) that describe the columns in the result set.

In a tuple, the value at the index 1 (the type_code attribute In the ResultMetadata object) represents the column data type. The Snowflake Connector for Python uses the following map to get the string representation, based on the type code:

type_code

String Representation

Data Type

0

FIXED

NUMBER/INT

1

REAL

REAL

2

TEXT

VARCHAR/STRING

3

DATE

DATE

4

TIMESTAMP

TIMESTAMP

5

VARIANT

VARIANT

6

TIMESTAMP_LTZ

TIMESTAMP_LTZ

7

TIMESTAMP_TZ

TIMESTAMP_TZ

8

TIMESTAMP_NTZ

TIMESTAMP_TZ

9

OBJECT

OBJECT

10

ARRAY

ARRAY

11

BINARY

BINARY

12

TIME

TIME

13

BOOLEAN

BOOLEAN

Data type mappings for qmark and numeric bindings

If paramstyle is either "qmark" or "numeric", the following default mappings from Python to Snowflake data type are used:

Python Data Type

Data Type in Snowflake

int

NUMBER(38, 0)

long

NUMBER(38, 0)

decimal

NUMBER(38, <scale>)

float

REAL

str

TEXT

unicode

TEXT

bytes

BINARY

bytearray

BINARY

bool

BOOLEAN

date

DATE

time

TIME

timedelta

TIME

datetime

TIMESTAMP_NTZ

struct_time

TIMESTAMP_NTZ

If you need to map to another Snowflake type (e.g. datetime to TIMESTAMP_LTZ), specify the Snowflake data type in a tuple consisting of the Snowflake data type followed by the value. See Binding datetime with TIMESTAMP for examples.

Object: Exception

PEP-249 defines the exceptions that the Snowflake Connector for Python can raise in case of errors or warnings. The application must handle them properly and decide to continue or stop running the code.

Methods

No methods are available for Exception objects.

Attributes

errno

Snowflake DB error code.

msg

Error message including error code, SQL State code and query ID.

raw_msg

Error message. No error code, SQL State code or query ID is included.

sqlstate

ANSI-compliant SQL State code

sfqid

Snowflake query ID.

Object ResultBatch

A ResultBatch object encapsulates a function that retrieves a subset of rows in a result set. To distribute the work of fetching results across multiple workers or nodes, you can call get_result_batches() method in the Cursor object to retrieve a list of ResultBatch objects and distribute these objects to different workers or nodes for processing.

Attributes

rowcount

Read-only attribute that returns the number of rows in the result batch.

compressed_size

Read-only attribute that returns the size of the data (when compressed) in the result batch.

uncompressed_size

Read-only attribute that returns the size of the data (uncompressed) in the result batch.

Methods

to_arrow()
Purpose:

This method returns a PyArrow table containing the rows in the ResultBatch object.

Parameters:

None.

Returns:

Returns a PyArrow table containing the rows from the ResultBatch object.

If there are no rows, this returns None.

to_pandas()
Purpose:

This method returns a pandas DataFrame containing the rows in the ResultBatch object.

Parameters:

None.

Returns:

Returns a pandas DataFrame containing the rows from the ResultBatch object.

If there are no rows, this returns an empty pandas DataFrame.

Object: ResultMetadata

A ResultMetadata object represents metadata about a column in the result set. A list of these objects is returned by the description attribute and describe method of the Cursor object.

This object was introduced in version 2.4.6 of the Snowflake Connector for Python.

Methods

None.

Attributes

name

Name of the column

type_code

Internal type code.

display_size

Not used. Same as internal_size.

internal_size

Internal data size.

precision

Precision of numeric data.

scale

Scale for numeric data.

is_nullable

True if NULL values allowed for the column or False.

Module: snowflake.connector.constants

The snowflake.connector.constants module defines constants used in the API.

Enums

class QueryStatus

Represents the status of an asynchronous query. This enum has the following constants:

Enum Constant

Description

RUNNING

The query is still running.

ABORTING

The query is in the process of being aborted on the server side.

SUCCESS

The query finished successfully.

FAILED_WITH_ERROR

The query finished unsuccessfully.

QUEUED

The query is queued for execution (i.e. has not yet started running), typically because it is waiting for resources.

DISCONNECTED

The session’s connection is broken. The query’s state will change to “FAILED_WITH_ERROR” soon.

RESUMING_WAREHOUSE

The warehouse is starting up and the query is not yet running.

BLOCKED

The statement is waiting on a lock held by another statement.

NO_DATA

Data about the statement is not yet available, typically because the statement has not yet started executing.

Module: snowflake.connector.pandas_tools

The snowflake.connector.pandas_tools module provides functions for working with the pandas data analysis library.

Functions

write_pandas(parameters...)
Purpose:

Writes a pandas DataFrame to a table in a Snowflake database.

To write the data to the table, the function saves the data to Parquet files, uses the PUT command to upload these files to a temporary stage, and uses the COPY INTO <table> command to copy the data from the files to the table. You can use some of the function parameters to control how the PUT and COPY INTO <table> statements are executed.

Parameters:

The valid input parameters are:

Parameter

Required

Description

conn

Yes

Connection object that holds the connection to the Snowflake database.

df

Yes

pandas.DataFrame object containing the data to be copied into the table.

table_name

Yes

Name of the table where the data should be copied.

database

Name of the database containing the table. By default, the function writes to the database that is currently in use in the session. Note: If you specify this parameter, you must also specify the schema parameter.

schema

Name of the schema containing the table. By default, the function writes to the table in the schema that is currently in use in the session.

chunk_size

Number of elements to insert at a time. By default, the function inserts all elements at once in one chunk.

compression

The compression algorithm to use for the Parquet files. You can specify either "gzip" for better compression or "snappy" for faster compression. By default, the function uses "gzip".

on_error

Specifies how errors should be handled. Set this to one of the string values documented in the ON_ERROR copy option. By default, the function uses "ABORT_STATEMENT".

parallel

Number of threads to use when uploading the Parquet files to the temporary stage. For the default number of threads used and guidelines on choosing the number of threads, see the parallel parameter of the PUT command.

quote_identifiers

If False, prevents the connector from putting double quotes around identifiers before sending the identifiers to the server. By default, the connector puts double quotes around identifiers.

Returns:

Returns a tuple of (success, num_chunks, num_rows, output) where:

  • success is True if the function successfully wrote the data to the table.

  • num_chunks is the number of chunks of data that the function copied.

  • num_rows is the number of rows that the function inserted.

  • output is the output of the COPY INTO <table> command.

Example:

The following example writes the data from a pandas DataFrame to the table named ‘customers’.

import pandas
from snowflake.connector.pandas_tools import write_pandas

# Create the connection to the Snowflake database.
cnx = snowflake.connector.connect(...)

# Create a DataFrame containing data about customers
df = pandas.DataFrame([('Mark', 10), ('Luke', 20)], columns=['name', 'balance'])

# Write the data from the DataFrame to the table named "customers".
success, nchunks, nrows, _ = write_pandas(cnx, df, 'customers')
Copy
pd_writer(parameters...)
Purpose:

pd_writer is an insertion method for inserting data into a Snowflake database.

When calling pandas.DataFrame.to_sql (see the pandas documentation), pass in method=pd_writer to specify that you want to use pd_writer as the method for inserting data. (You do not need to call pd_writer from your own code. The to_sql method calls pd_writer and supplies the input parameters needed.)

Note

Please note that when column names in the pandas DataFrame contain only lowercase letters, you must enclose the column names in double quotes; otherwise the connector raises a ProgrammingError.

The snowflake-sqlalchemy library does not quote lowercase column names when creating a table, while pd_writer quotes column names by default. The issue arises because the COPY INTO command expects column names to be quoted.

Future improvements will be made in the snowflake-sqlalchemy library.

For example:

import pandas as pd
from snowflake.connector.pandas_tools import pd_writer

sf_connector_version_df = pd.DataFrame([('snowflake-connector-python', '1.0')], columns=['NAME', 'NEWEST_VERSION'])

# Specify that the to_sql method should use the pd_writer function
# to write the data from the DataFrame to the table named "driver_versions"
# in the Snowflake database.
sf_connector_version_df.to_sql('driver_versions', engine, index=False, method=pd_writer)

# When the column names consist of only lower case letters, quote the column names
sf_connector_version_df = pd.DataFrame([('snowflake-connector-python', '1.0')], columns=['"name"', '"newest_version"'])
sf_connector_version_df.to_sql('driver_versions', engine, index=False, method=pd_writer)
Copy

The pd_writer function uses the write_pandas() function to write the data in the DataFrame to the Snowflake database.

Parameters:

The valid input parameters are:

Parameter

Required

Description

table

Yes

pandas.io.sql.SQLTable object for the table.

conn

Yes

sqlalchemy.engine.Engine or sqlalchemy.engine.Connection object used to connect to the Snowflake database.

keys

Yes

Names of the table columns for the data to be inserted.

data_iter

Yes

Iterator for the rows containing the data to be inserted.

Example:

The following example passes method=pd_writer to the pandas.DataFrame.to_sql method, which in turn calls the pd_writer function to write the data in the pandas DataFrame to a Snowflake database.

import pandas
from snowflake.connector.pandas_tools import pd_writer

# Create a DataFrame containing data about customers
df = pandas.DataFrame([('Mark', 10), ('Luke', 20)], columns=['name', 'balance'])

# Specify that the to_sql method should use the pd_writer function
# to write the data from the DataFrame to the table named "customers"
# in the Snowflake database.
df.to_sql('customers', engine, index=False, method=pd_writer)
Copy

Date and timestamp support

Snowflake supports multiple DATE and TIMESTAMP data types, and the Snowflake Connector allows binding native datetime and date objects for update and fetch operations.

Fetching data

When fetching date and time data, the Snowflake data types are converted into Python data types:

Snowflake Data Types

Python Data Type

Behavior

TIMESTAMP_TZ

datetime with tzinfo

Fetches data, including the time zone offset, and translates it into a datetime with tzinfo object.

TIMESTAMP_LTZ, TIMESTAMP

datetime with tzinfo

Fetches data, translates it into a datetime object, and attaches tzinfo based on the TIMESTAMP_TYPE_MAPPING session parameter.

TIMESTAMP_NTZ

datetime

Fetches data and translates it into a datetime object. No time zone information is attached to the object.

DATE

date

Fetches data and translates it into a date object. No time zone information is attached to the object.

Note

tzinfo is a UTC offset-based time zone object and not IANA time zone names. The time zone names might not match, but equivalent offset-based time zone objects are considered identical.

Updating data

When updating date and time data, the Python data types are converted to Snowflake data types:

Python Data Type

Snowflake Data Types

Behavior

datetime

TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE

Converts a datetime object into a string in the format of YYYY-MM-DD HH24:MI:SS.FF TZH:TZM and updates it. If no time zone offset is provided, the string will be in the format of YYYY-MM-DD HH24:MI:SS.FF. The user is responsible for setting the tzinfo for the datetime object.

struct_time

TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE

Converts a struct_time object into a string in the format of YYYY-MM-DD HH24:MI:SS.FF TZH:TZM and updates it. The time zone information is retrieved from time.timezone, which includes the time zone offset from UTC. The user is responsible for setting the TZ environment variable for time.timezone.

date

TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE

Converts a date object into a string in the format of YYYY-MM-DD. No time zone is considered.

time

TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE

Converts a time object into a string in the format of HH24:MI:SS.FF. No time zone is considered.

timedelta

TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE

Converts a timedelta object into a string in the format of HH24:MI:SS.FF. No time zone is considered.