Python Connector API¶
The Snowflake Connector for Python implements the Python Database API v2.0 specification (PEP-249). This topic covers the standard API and the Snowflake-specific extensions.
For more information, see the PEP-249 documentation.
Module: snowflake.connector¶
The main module is snowflake.connector, which creates a Connection object and provides
Error classes.
Constants¶
- apilevel¶
String constant stating the supported API level. The connector supports API
"2.0".
- threadsafety¶
Integer constant stating the level of thread safety the interface supports. The Snowflake Connector for Python supports level
2, which states that threads can share the module and connections.
- paramstyle¶
String constant stating the type of parameter marker formatting expected by the interface. The connector supports the "pyformat" type by default, which applies to Python extended format codes (e.g. ...WHERE name=%s or ...WHERE name=%(name)s). Connection.connect can override paramstyle to change the bind variable formats to "qmark" or "numeric", where the variables are ? or :N, respectively. For example:

format:   .execute("... WHERE my_column = %s", (value,))
pyformat: .execute("... WHERE my_column = %(name)s", {"name": value})
qmark:    .execute("... WHERE my_column = ?", (value,))
numeric:  .execute("... WHERE my_column = :1", (value,))
Note
Variable binding occurs on the client side if paramstyle is "pyformat" or "format", and on the server side if it is "qmark" or "numeric". Currently, there is no significant difference between these options in terms of performance or features because the connector doesn't support compiling SQL text followed by multiple executions. Instead, the "qmark" and "numeric" options align with the query text compatibility of other drivers (i.e. JDBC, ODBC, Go Snowflake Driver), which support server-side binding with the variable format ? or :N.
Functions¶
- connect(parameters...)¶
- Purpose:
Constructor for creating a connection to the database. Returns a Connection object.

By default, autocommit mode is enabled (i.e. if the connection is closed, all changes are committed). If you need a transaction, use the BEGIN command to start the transaction, and COMMIT or ROLLBACK to commit or roll back any changes.
- Parameters:
The valid input parameters are:
- account (required): Your account identifier. The account identifier does not include the snowflakecomputing.com suffix. For details and examples, see Usage Notes (in this topic).
- user (required): Login name for the user.
- password (required): Password for the user.
- application: Name that identifies the application making the connection.
- region: Deprecated. This description of the parameter is for backwards compatibility only.
- host: Host name.
- port: Port number (443 by default).
- database: Name of the default database to use. After login, you can use USE DATABASE to change the database.
- schema: Name of the default schema to use for the database. After login, you can use USE SCHEMA to change the schema.
- role: Name of the default role to use. After login, you can use USE ROLE to change the role.
- warehouse: Name of the default warehouse to use. After login, you can use USE WAREHOUSE to change the warehouse.
- passcode_in_password: False by default. Set this to True if the MFA (Multi-Factor Authentication) passcode is embedded in the login password.
- passcode: The passcode provided by Duo when using MFA (Multi-Factor Authentication) for login.
- private_key: The private key used for authentication. For more information, see Using key-pair authentication and key-pair rotation.
- private_key_file: Specifies the path to the private key file for the specified user. See Using key-pair authentication and key-pair rotation.
- private_key_file_pwd: Specifies the passphrase to decrypt the private key file for the specified user. See Using key-pair authentication and key-pair rotation.
- autocommit: None by default, which honors the Snowflake parameter AUTOCOMMIT. Set to True or False to enable or disable autocommit mode in the session, respectively.
- client_fetch_use_mp: When set to True, enables multi-processed fetching, which in many cases should reduce the fetching time. Default: False.
- client_prefetch_threads: Number of threads used to download the result sets (4 by default). Increasing the value improves fetch performance but requires more memory.
- client_session_keep_alive: To keep the session active indefinitely (even if there is no activity from the user), set this to True. When setting this to True, call the close method to terminate the thread properly; otherwise, the process might hang. The default value depends on the version of the connector that you are using:
  - Version 2.4.6 and later: None by default. When the value is None, the CLIENT_SESSION_KEEP_ALIVE session parameter takes precedence. To override the session parameter, pass in True or False for this argument.
  - Version 2.4.5 and earlier: False by default. When the value is False (either by specifying the value explicitly or by omitting the argument), the CLIENT_SESSION_KEEP_ALIVE session parameter takes precedence. Passing client_session_keep_alive=False to the connect method does not override the value TRUE in the CLIENT_SESSION_KEEP_ALIVE session parameter.
- login_timeout: Timeout in seconds for login. By default, 60 seconds. The login request gives up after the timeout length if the HTTP response is "success".
- network_timeout: Timeout in seconds for all other operations. By default, none/infinite. A general request gives up after the timeout length if the HTTP response is not "success".
- ocsp_response_cache_filename: URI for the OCSP response cache file. By default, the OCSP response cache file is created in the cache directory:
  - Linux: ~/.cache/snowflake/ocsp_response_cache
  - macOS: ~/Library/Caches/Snowflake/ocsp_response_cache
  - Windows: %USERPROFILE%\AppData\Local\Snowflake\Caches\ocsp_response_cache
  To locate the file in a different directory, specify the path and file name in the URI (e.g. file:///tmp/my_ocsp_response_cache.txt).
- authenticator: Authenticator for Snowflake:
  - snowflake (default) to use the internal Snowflake authenticator.
  - externalbrowser to authenticate using your web browser and Okta, AD FS, or any other SAML 2.0-compliant identity provider (IdP) that has been defined for your account.
  - https://<okta_account_name>.okta.com (i.e. the URL endpoint for your Okta account) to authenticate through native Okta.
  - oauth to authenticate using OAuth. You must also specify the token parameter and set its value to the OAuth access token.
  - username_password_mfa to authenticate with MFA token caching. For more details, see Using MFA token caching to minimize the number of prompts during authentication — optional.
  - OAUTH_AUTHORIZATION_CODE to use the OAuth 2.0 Authorization Code flow.
  - OAUTH_CLIENT_CREDENTIALS to use the OAuth 2.0 Client Credentials flow.
  - WORKLOAD_IDENTITY to authenticate with the workload identity federation (WIF) authenticator.
  If the value is not snowflake, the user and password parameters must be your login credentials for the IdP.
- validate_default_parameters: False by default. If True, then:
  - Raise an exception if the specified database, schema, or warehouse doesn't exist.
  - Print a warning to stderr if an invalid argument name or an argument value of the wrong data type is passed.
- paramstyle: pyformat by default for client-side binding. Specify qmark or numeric to change bind variable formats for server-side binding.
- timezone: None by default, which honors the Snowflake parameter TIMEZONE. Set to a valid time zone (e.g. America/Los_Angeles) to set the session time zone.
- arrow_number_to_decimal: False by default, which means that NUMBER column values are returned as double-precision floating point numbers (float64). Set this to True to return DECIMAL column values as decimal numbers (decimal.Decimal) when calling the fetch_pandas_all() and fetch_pandas_batches() methods. This parameter was introduced in version 2.4.3 of the Snowflake Connector for Python.
- socket_timeout: Timeout in seconds for socket-level read and connect requests. For more information, see Managing connection timeouts.
- backoff_policy: Name of the generator function that defines how long to wait between retries. For more information, see Managing connection backoff policies for retries.
- enable_connection_diag: Whether to generate a connectivity diagnostic report. Default is False.
- connection_diag_log_path: Absolute path for the location of the diagnostic report. Used only if enable_connection_diag is True. Default is the default temp directory for your operating system, such as /tmp for Linux or Mac.
- connection_diag_allowlist_path: Absolute path to a JSON file containing the output of SYSTEM$ALLOWLIST() or SYSTEM$ALLOWLIST_PRIVATELINK(). Required only if the user defined in the connection does not have permission to run the system allowlist functions or if connecting to the account URL fails.
- iobound_tpe_limit: Size of the preprocess and postprocess thread pool executors (TPEs). By default, the value is the lesser of the number of files and the number of CPU cores.
- unsafe_file_write: Specifies which file permissions to assign to files downloaded from a stage using a GET command. False (default) sets the file permissions to 600, which means only the owner can access the files. True sets the permissions to 644, which gives the owner read and write permissions and read-only permissions to everyone else. For more information, see Downloading data.
- oauth_client_id: Value of the client id provided by the Identity Provider for Snowflake integration (Snowflake security integration metadata).
- oauth_client_secret: Value of the client secret provided by the Identity Provider for Snowflake integration (Snowflake security integration metadata).
- oauth_authorization_url: Identity Provider endpoint supplying the authorization code to the driver. When using Snowflake as an Identity Provider, this value is derived from the server or account parameters.
- oauth_token_request_url: Identity Provider endpoint supplying the access tokens to the driver. When using Snowflake as an Identity Provider, this value is derived from the server or account parameters.
- oauth_scope: Scope requested in the Identity Provider authorization request. By default, it is derived from the role. When multiple scopes are required, the value should be a space-separated list of scopes.
- oauth_redirect_uri: URI to use for authorization code redirection (Snowflake security integration metadata). Default: http://127.0.0.1:{randomAvailablePort}.
- oauth_disable_pkce: Disables Proof Key for Code Exchange (PKCE), a security enhancement that ensures that even if malicious attackers intercept an authorization code, they cannot exchange it for a valid access token.
- oauth_enable_refresh_token: Enables silent re-authentication when the current access token becomes outdated, provided this is supported by the Authorization Server and client_store_temporary_credential is set to True.
- oauth_enable_single_use_refresh_tokens: Whether to opt in to single-use refresh token semantics.
- oauth_credentials_in_body: Whether or not credentials should be sent in the body for OAuth authentication. Default is False.
- workload_identity_provider: Platform of the workload identity provider. Possible values include: AWS, AZURE, GCP, and OIDC.
- workload_identity_impersonation_path: An array of strings that provides an identity chain to use when connecting to Snowflake. Array elements are either a full service account address or a service account's unique ID. Impersonation works by following each array entry in order to obtain a token that allows authorization of the next service account. Each account in the identity chain needs permissions to impersonate the next account only. The final account in the list obtains your Snowflake connection token and is used to connect to Snowflake. Account impersonation is supported only for Google Cloud and AWS workloads.
- unsafe_skip_file_permissions_check: Whether to skip permissions checks on file access. Default is False.
- cert_revocation_check_mode: Certificate revocation check mode. For accepted values, see CertRevocationCheckMode.
- allow_certificates_without_crl_url: Whether to allow certificates without certificate revocation list distribution points. Default is False.
- crl_connection_timeout_ms: The connection timeout for certificate revocation list downloads, in milliseconds. Default is 3000.
- crl_read_timeout_ms: The read timeout for certificate revocation list downloads, in milliseconds. Default is 3000.
- crl_cache_validity_hours: How long the certificate revocation list cache is valid, in hours. Default is 24.
- enable_crl_cache: Whether or not to enable certificate revocation list caching. Default is True.
- enable_crl_file_cache: Whether to cache the certificate revocation list to disk. Applies only when enable_crl_cache is True. Default is True.
- crl_cache_dir: The directory in which to store the certificate revocation list cache. Applies only when enable_crl_file_cache is True.
- crl_cache_removal_delay_days: The amount of time to keep expired certificate revocation list files on disk, in days. Default is 7.
- crl_cache_cleanup_interval_hours: How often to clean the certificate revocation list cache, in hours. Default is 1.
- crl_cache_start_cleanup: Whether to run certificate revocation list cleanup activities in the background. Default is False.
- ocsp_root_certs_dict_lock_timeout: Timeout for acquiring the lock on the OCSP root certs dictionary, in seconds. A value of -1 disables timeouts. Default is -1.
Attributes¶
- Error, Warning, ...
All exception classes defined by the Python database API standard. The Snowflake Connector for Python provides the attributes msg, errno, sqlstate, sfqid and raw_msg.
Usage notes for the account parameter (for the connect method)¶
For the required account parameter, specify your account identifier.
Note that the account identifier does not include the snowflakecomputing.com domain name. Snowflake automatically
appends this when creating the connection.
The following example uses the account name as an identifier for the account myaccount in
the organization myorganization.
ctx = snowflake.connector.connect(
    user='<user_name>',
    password='<password>',
    account='myorganization-myaccount',
    ... )
The following example uses the account locator xy12345 as the account identifier:
ctx = snowflake.connector.connect(
    user='<user_name>',
    password='<password>',
    account='xy12345',
    ... )
Note that this example uses an account in the AWS US West (Oregon) region. If the account is in a different region or if the account uses a different cloud provider, you need to specify additional segments after the account locator.
Object: Connection¶
A Connection object holds the connection and session information to keep the database connection active. If it is closed or the session expires, any subsequent operations will fail.
Methods¶
- autocommit(True|False)¶
- Purpose:
Enables or disables autocommit mode. By default, autocommit is enabled (True).
- close()¶
- Purpose:
Closes the connection. If a transaction is still open when the connection is closed, the changes are rolled back.
Closing the connection explicitly removes the active session from the server; otherwise, the active session continues until it is eventually purged from the server, limiting the number of concurrent queries.
For example:
# Context manager ensures the connection is closed.
with snowflake.connector.connect(...) as con:
    con.cursor().execute(...)

# try & finally to ensure the connection is closed.
con = snowflake.connector.connect(...)
try:
    con.cursor().execute(...)
finally:
    con.close()
- commit()¶
- Purpose:
If autocommit is disabled, commits the current transaction. If autocommit is enabled, this method is ignored.
- rollback()¶
- Purpose:
If autocommit is disabled, rolls back the current transaction. If autocommit is enabled, this method is ignored.
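For example, a minimal sketch of an explicit transaction with autocommit disabled; the table and values are placeholders:

con = snowflake.connector.connect(...)
con.autocommit(False)       # disable autocommit for this session
cur = con.cursor()
try:
    cur.execute("INSERT INTO testtable(col1) VALUES ('ok1')")
    cur.execute("INSERT INTO testtable(col1) VALUES ('ok2')")
    con.commit()            # make both inserts permanent
except Exception:
    con.rollback()          # undo both inserts on any error
    raise
finally:
    con.close()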
- cursor()¶
- Purpose:
Constructor for creating a Cursor object. The return values from fetch*() calls will be a single sequence or list of sequences.
- cursor(snowflake.connector.DictCursor)
- Purpose:
Constructor for creating a DictCursor object. The return values from fetch*() calls will be a single dict or list of dict objects. This is useful for fetching values by column name from the results.
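For example, a minimal sketch of fetching values by column name with DictCursor; the table and column names are placeholders, and unquoted identifiers come back in uppercase:

from snowflake.connector import DictCursor

cur = con.cursor(DictCursor)
cur.execute("SELECT col1, col2 FROM testtable")
for row in cur:
    print(row['COL1'], row['COL2'])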
- execute_string(sql_text, remove_comments=False, return_cursors=True)¶
- Purpose:
Execute one or more SQL statements passed as strings. If remove_comments is set to True, comments are removed from the query. If return_cursors is set to True, this method returns a sequence of Cursor objects in the order of execution.
- Example:
This example shows executing multiple commands in a single string and then using the sequence of cursors that is returned:
cursor_list = connection1.execute_string(
    "SELECT * FROM testtable WHERE col1 LIKE 'T%';"
    "SELECT * FROM testtable WHERE col2 LIKE 'A%';"
)

for cursor in cursor_list:
    for row in cursor:
        print(row[0], row[1])
Note
Methods such as execute_string() that allow multiple SQL statements in a single string are vulnerable to SQL injection attacks. Avoid using string concatenation, or functions such as Python's format() function, to dynamically compose a SQL statement by combining SQL with data from users unless you have validated the user data. The example below demonstrates the problem:

# "Binding" data via the format() function (UNSAFE EXAMPLE)
value1_from_user = "'ok3'); DELETE FROM testtable WHERE col1 = 'ok1'; select pi("
sql_cmd = "insert into testtable(col1) values('ok1'); " \
          "insert into testtable(col1) values('ok2'); " \
          "insert into testtable(col1) values({col1});".format(col1=value1_from_user)

# Show what SQL Injection can do to a composed statement.
print(sql_cmd)

connection1.execute_string(sql_cmd)
The dynamically-composed statement looks like the following (newlines have been added for readability):
insert into testtable(col1) values('ok1');
insert into testtable(col1) values('ok2');
insert into testtable(col1) values('ok3');
DELETE FROM testtable WHERE col1 = 'ok1';
select pi();
If you are combining SQL statements with strings entered by untrusted users, then it is safer to bind data to a statement than to compose a string. The execute_string() method doesn't take binding parameters, so to bind parameters use Cursor.execute() or Cursor.executemany().
- execute_stream(sql_stream, remove_comments=False)¶
- Purpose:
Execute one or more SQL statements passed as a stream object. If remove_comments is set to True, comments are removed from the query. This generator yields each Cursor object as SQL statements run.

If sql_stream ends with comment lines, you must set remove_comments to True, similar to the following:

from io import StringIO

sql_script = """
-- This is first comment line;
select 1;
select 2;
-- This is comment in middle;
-- With some extra comment lines;
select 3;
-- This is the end with last line comment;
"""
sql_stream = StringIO(sql_script)
with con.cursor() as cur:
    for result_cursor in con.execute_stream(sql_stream, remove_comments=True):
        for result in result_cursor:
            print(f"Result: {result}")
- get_query_status(query_id)¶
- Purpose:
Returns the status of a query.
- Parameters:
query_id: The ID of the query. See Retrieving the Snowflake query ID.
- Returns:
Returns the QueryStatus object that represents the status of the query.
- Example:
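A minimal sketch of polling the status of an asynchronous query; the query is a placeholder (see also Checking the status of a query):

import time

cur = con.cursor()
cur.execute_async("SELECT COUNT(*) FROM testtable")
query_id = cur.sfqid

# Wait until the query has finished.
while con.is_still_running(con.get_query_status(query_id)):
    time.sleep(1)

status = con.get_query_status(query_id)
if con.is_an_error(status):
    print(f"Query {query_id} failed with status {status}")
else:
    print(f"Query {query_id} finished with status {status}")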
- get_query_status_throw_if_error(query_id)¶
- Purpose:
Returns the status of a query. If the query results in an error, this method raises a ProgrammingError (as the execute() method would).
- Parameters:
query_id: The ID of the query. See Retrieving the Snowflake query ID.
- Returns:
Returns the QueryStatus object that represents the status of the query.
- Example:
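A minimal sketch, assuming query_id was obtained as in the previous example:

from snowflake.connector.errors import ProgrammingError

try:
    status = con.get_query_status_throw_if_error(query_id)
except ProgrammingError as err:
    print(f"Query failed: {err}")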
- is_valid()¶
- Purpose:
Returns True if the connection is stable enough to receive queries.
- is_still_running(query_status)¶
- Purpose:
Returns True if the query status indicates that the query has not yet completed or is still in process.
- Parameters:
query_status: The QueryStatus object that represents the status of the query. To get this object for a query, see Checking the status of a query.
- is_an_error(query_status)¶
- Purpose:
Returns True if the query status indicates that the query resulted in an error.
- Parameters:
query_status: The QueryStatus object that represents the status of the query. To get this object for a query, see Checking the status of a query.
Attributes¶
- expired¶
Tracks whether the connection’s master token has expired.
- messages¶
List object containing sequences (exception class, exception value) for all messages received from the underlying database for this connection.
The list is cleared automatically by any method call.
- errorhandler¶
Read/Write attribute that references an error handler to call in case an error condition is met.
The handler must be a Python callable that accepts the following arguments:
errorhandler(connection, cursor, errorclass, errorvalue)
- Error, Warning, ...
All exception classes defined by the Python database API standard.
Object: Cursor¶
A Cursor object represents a database cursor for execute and fetch operations.
Each cursor has its own attributes, description and rowcount, such that
cursors are isolated.
Methods¶
- close()
- Purpose:
Closes the cursor object.
- describe(command [, parameters][, timeout][, file_stream])¶
- Purpose:
Returns metadata about the result set without executing a database command. This returns the same metadata that is available in the description attribute after executing a query.

This method was introduced in version 2.4.6 of the Snowflake Connector for Python.
- Parameters:
See the parameters for the execute() method.
- Returns:
Returns a list of ResultMetadata objects that describe the columns in the result set.
- Example:
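A minimal sketch of inspecting column metadata without executing the query; the table name is a placeholder:

cur = con.cursor()
for column in cur.describe("SELECT * FROM testtable"):
    print(column.name, column.type_code, column.is_nullable)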
- execute(command [, parameters][, timeout][, file_stream])¶
- Purpose:
Prepares and executes a database command.
- Parameters:
command: A string containing the SQL statement to execute.
parameters: (Optional) If you used parameters for binding data in the SQL statement, set this to the list or dictionary of variables that should be bound to those parameters.
For more information about mapping the Python data types for the variables to the SQL data types of the corresponding columns, see Data type mappings for qmark and numeric bindings.
timeout: (Optional) Number of seconds to wait for the query to complete. If the query has not completed after this time has passed, the query should be aborted.
file_stream: (Optional) When executing a PUT command, you can use this parameter to upload an in-memory file-like object (e.g. the I/O object returned from the Python open() function), rather than a file on the filesystem. Set this parameter to that I/O object.
When specifying the URI for the data file in the PUT command:
You can use any directory path. The directory path that you specify in the URI is ignored.
For the filename, specify the name of the file that should be created on the stage.
For example, to upload a file from a file stream to a file named:
@mystage/myfile.csv
use the following call:
cursor.execute( "PUT file://this_directory_path/is_ignored/myfile.csv @mystage", file_stream=<io_object>)
- Returns:
Returns the reference of a Cursor object.
- executemany(command, seq_of_parameters)¶
- Purpose:
Prepares a database command and executes it against all parameter sequences found in seq_of_parameters. You can use this method to perform a batch insert operation.
- Parameters:
command: The command is a string containing the code to execute. The string should contain one or more placeholders (such as question marks) for Binding data.
For example:

"insert into testy (v1, v2) values (?, ?)"

seq_of_parameters: This should be a sequence (list or tuple) of lists or tuples. See the example code below for example sequences.
- Returns:
Returns the reference of a Cursor object.
- Example:
# This example uses qmark (question mark) binding, so
# you must configure the connector to use this binding style.
from snowflake import connector
connector.paramstyle='qmark'

stmt1 = "create table testy (V1 varchar, V2 varchar)"
cs.execute(stmt1)

# A list of lists
sequence_of_parameters1 = [ ['Smith', 'Ann'], ['Jones', 'Ed'] ]

# A tuple of tuples
sequence_of_parameters2 = ( ('Cho', 'Kim'), ('Cooper', 'Pat') )

stmt2 = "insert into testy (v1, v2) values (?, ?)"
cs.executemany(stmt2, sequence_of_parameters1)
cs.executemany(stmt2, sequence_of_parameters2)
Internally, multiple execute methods are called and the result set from the last execute call will remain.

Note

The executemany method can only be used to execute a single parameterized SQL statement and pass multiple bind values to it.

Executing multiple SQL statements separated by a semicolon in one execute call is not supported. Instead, issue a separate execute call for each statement.
- execute_async(...)¶
- Purpose:
Prepares and submits a database command for asynchronous execution. See Performing an asynchronous query.
- Parameters:
This method uses the same parameters as the execute() method.
- Returns:
Returns the reference of a Cursor object.
- Example:
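A minimal sketch of submitting a query asynchronously; the query is a placeholder:

cur = con.cursor()
cur.execute_async("SELECT COUNT(*) FROM testtable")
query_id = cur.sfqid      # save the query ID to check status or fetch results later
print(con.get_query_status(query_id))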
- fetch_arrow_all()¶
- Purpose:
This method fetches all the rows in a cursor and loads them into a PyArrow table.
- Parameters:
None.
- Returns:
Returns a PyArrow table containing all the rows from the result set.
If there are no rows, this returns None.
- Example:
See Distributing workloads that fetch results with the Snowflake Connector for Python.
- fetch_arrow_batches()¶
- Purpose:
This method fetches a subset of the rows in a cursor and delivers them to a PyArrow table.
- Parameters:
None.
- Returns:
Returns a PyArrow table containing a subset of the rows from the result set.
Returns None if there are no more rows to fetch.
- Example:
See Distributing workloads that fetch results with the Snowflake Connector for Python.
- get_result_batches()¶
- Purpose:
Returns a list of ResultBatch objects that you can use to fetch a subset of rows from the result set.
- Parameters:
None.
- Returns:
Returns a list of ResultBatch objects or None if the query has not finished executing.
- Example:
See Distributing workloads that fetch results with the Snowflake Connector for Python.
- get_results_from_sfqid(query_id)¶
- Purpose:
Retrieves the results of an asynchronous query or a previously submitted synchronous query.
- Parameters:
query_id: The ID of the query. See Retrieving the Snowflake query ID.
- Example:
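A minimal sketch of fetching the results of a previously submitted query by its query ID; query_id is assumed to have been saved earlier, for example from cur.sfqid:

cur = con.cursor()
cur.get_results_from_sfqid(query_id)
print(cur.fetchall())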
- fetchone()¶
- Purpose:
Fetches the next row of a query result set and returns a single sequence/dict or None when no more data is available.
- fetchmany([size=cursor.arraysize])¶
- Purpose:
Fetches the next rows of a query result set and returns a list of sequences/dict. An empty sequence is returned when no more rows are available.
- fetchall()¶
- Purpose:
Fetches all or remaining rows of a query result set and returns a list of sequences/dict.
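For example, a minimal sketch of the three fetch methods on one result set; the table and column names are placeholders:

cur = con.cursor()
cur.execute("SELECT col1, col2 FROM testtable")

first_row = cur.fetchone()    # a single tuple, or None if the result set is empty
next_rows = cur.fetchmany(3)  # a list of up to 3 tuples
remaining = cur.fetchall()    # a list of all remaining tuples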
- fetch_pandas_all()¶
- Purpose:
This method fetches all the rows in a cursor and loads them into a pandas DataFrame.
- Parameters:
None.
- Returns:
Returns a DataFrame containing all the rows from the result set.
For more information about pandas data frames, see the pandas DataFrame documentation .
If there are no rows, this returns None.
- Usage Notes:
This method is not a complete replacement for the read_sql() method of pandas; this method is to provide a fast way to retrieve data from a SELECT query and store the data in a pandas DataFrame.

Currently, this method works only for SELECT statements.
- Examples:
ctx = snowflake.connector.connect(
    host=host,
    user=user,
    password=password,
    account=account,
    warehouse=warehouse,
    database=database,
    schema=schema,
    protocol='https',
    port=port)

# Create a cursor object.
cur = ctx.cursor()

# Execute a statement that will generate a result set.
sql = "select * from t"
cur.execute(sql)

# Fetch the result set from the cursor and deliver it as the pandas DataFrame.
df = cur.fetch_pandas_all()

# ...
- fetch_pandas_batches()¶
- Purpose:
This method fetches a subset of the rows in a cursor and delivers them to a pandas DataFrame.
- Parameters:
None.
- Returns:
Returns a DataFrame containing a subset of the rows from the result set.
For more information about pandas data frames, see the pandas DataFrame documentation.
Returns None if there are no more rows to fetch.
- Usage Notes:
Depending upon the number of rows in the result set, as well as the number of rows specified in the method call, the method might need to be called more than once, or it might return all rows in a single batch if they all fit.
This method is not a complete replacement for the read_sql() method of pandas; this method is to provide a fast way to retrieve data from a SELECT query and store the data in a pandas DataFrame.

Currently, this method works only for SELECT statements.
- Examples:
ctx = snowflake.connector.connect(
    host=host,
    user=user,
    password=password,
    account=account,
    warehouse=warehouse,
    database=database,
    schema=schema,
    protocol='https',
    port=port)

# Create a cursor object.
cur = ctx.cursor()

# Execute a statement that will generate a result set.
sql = "select * from t"
cur.execute(sql)

# Fetch the result set from the cursor and deliver it as the pandas DataFrame.
for df in cur.fetch_pandas_batches():
    my_dataframe_processing_function(df)

# ...
- __iter__()¶
Returns self to make cursors compatible with the iteration protocol.
Attributes¶
- description¶
Read-only attribute that returns metadata about the columns in the result set.
This attribute is set after you call the execute() method to execute the query. (In version 2.4.6 or later, you can retrieve this metadata without executing the query by calling the describe() method.)

This attribute is set to one of the following:
Versions 2.4.5 and earlier: This attribute is set to a list of tuples.
Versions 2.4.6 and later: This attribute is set to a list of ResultMetadata objects.
Each tuple or ResultMetadata object contains the metadata that describes a column in the result set. You can access the metadata by index or, in versions 2.4.6 and later, by ResultMetadata object attribute:

| Index of Value | ResultMetadata Attribute | Description |
|---|---|---|
| 0 | name | Column name. |
| 1 | type_code | Internal type code. |
| 2 | display_size | (Not used. Same as internal_size.) |
| 3 | internal_size | Internal data size. |
| 4 | precision | Precision of numeric data. |
| 5 | scale | Scale for numeric data. |
| 6 | is_nullable | True if NULL values are allowed for the column, or False. |

For examples of getting this attribute, see Retrieving column metadata.
- rowcount¶
Read-only attribute that returns the number of rows that the last execute produced. The value is -1 or None if no execute has been executed.
- sfqid¶
Read-only attribute that returns the Snowflake query ID of the last execute or execute_async call.
- arraysize¶
Read/write attribute that specifies the number of rows to fetch at a time with fetchmany(). It defaults to 1, meaning to fetch a single row at a time.
- connection¶
Read-only attribute that returns a reference to the Connection object on which the cursor was created.
- messages
List object that includes the sequences (exception class, exception value) for all messages received from the underlying database for the cursor.

The list is cleared automatically by any method call except for fetch*() calls.
- errorhandler
Read/write attribute that references an error handler to call in case an error condition is met.
The handler must be a Python callable that accepts the following arguments:
errorhandler(connection, cursor, errorclass, errorvalue)
Type codes¶
In the Cursor object, the description attribute and the describe() method provide a list of tuples
(or, in versions 2.4.6 and later, ResultMetadata objects) that describe the
columns in the result set.
In a tuple, the value at index 1 (the type_code attribute in the ResultMetadata object) represents the column data type. The Snowflake Connector for Python uses the following map to get the string representation, based on the type code:
| type_code | String Representation | Data Type |
|---|---|---|
| 0 | FIXED | NUMBER/INT |
| 1 | REAL | REAL |
| 2 | TEXT | VARCHAR/STRING |
| 3 | DATE | DATE |
| 4 | TIMESTAMP | TIMESTAMP |
| 5 | VARIANT | VARIANT |
| 6 | TIMESTAMP_LTZ | TIMESTAMP_LTZ |
| 7 | TIMESTAMP_TZ | TIMESTAMP_TZ |
| 8 | TIMESTAMP_NTZ | TIMESTAMP_NTZ |
| 9 | OBJECT | OBJECT |
| 10 | ARRAY | ARRAY |
| 11 | BINARY | BINARY |
| 12 | TIME | TIME |
| 13 | BOOLEAN | BOOLEAN |
| 14 | GEOGRAPHY | GEOGRAPHY |
| 15 | GEOMETRY | GEOMETRY |
| 16 | VECTOR | VECTOR |
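For example, a minimal sketch of reading type codes from a cursor's description; the table name is a placeholder, and in version 2.4.6 and later the entries expose name and type_code attributes:

cur = con.cursor()
cur.execute("SELECT col1, col2 FROM testtable")
for col in cur.description:
    # type_code is the integer listed in the table above (e.g. 0 for FIXED).
    print(col.name, col.type_code)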
Data type mappings for qmark and numeric bindings¶
If paramstyle is either "qmark" or "numeric", the following default mappings from
Python to Snowflake data type are used:
| Python Data Type | Data Type in Snowflake |
|---|---|
| int | NUMBER(38, 0) |
| long | NUMBER(38, 0) |
| decimal | NUMBER(38, <scale>) |
| float | REAL |
| str | TEXT |
| unicode | TEXT |
| bytes | BINARY |
| bytearray | BINARY |
| bool | BOOLEAN |
| date | DATE |
| time | TIME |
| timedelta | TIME |
| datetime | TIMESTAMP_NTZ |
| struct_time | TIMESTAMP_NTZ |
If you need to map to another Snowflake type (e.g. datetime to TIMESTAMP_LTZ), specify the
Snowflake data type in a tuple consisting of the Snowflake data type followed by the value. See
Binding datetime with TIMESTAMP for examples.
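For example, a minimal sketch of overriding the default mapping by passing a (Snowflake data type, value) tuple with qmark binding; the table ts_table and its TIMESTAMP_LTZ column c1 are placeholders:

from datetime import datetime

con.cursor().execute(
    "INSERT INTO ts_table(c1) VALUES (?)",
    [("TIMESTAMP_LTZ", datetime.now())],
)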
Object: Exception¶
PEP-249 defines the exceptions that the Snowflake Connector for Python can raise in case of errors or warnings. The application must handle them properly and decide to continue or stop running the code.
For more information, see the PEP-249 documentation.
Methods¶
No methods are available for Exception objects.
Attributes¶
- errno¶
Snowflake DB error code.
- msg¶
Error message including error code, SQL State code and query ID.
- raw_msg¶
Error message. No error code, SQL State code or query ID is included.
- sqlstate¶
ANSI-compliant SQL State code.
- sfqid
Snowflake query ID.
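For example, a minimal sketch of catching a connector error and inspecting these attributes; the statement is deliberately invalid:

from snowflake.connector.errors import ProgrammingError

try:
    con.cursor().execute("SELECT * FROM nonexistent_table")
except ProgrammingError as e:
    print(f"errno={e.errno}, sqlstate={e.sqlstate}, sfqid={e.sfqid}")
    print(e.msg)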
Object ResultBatch¶
A ResultBatch object encapsulates a function that retrieves a subset of rows in a result set. To
distribute the work of fetching results across multiple workers or nodes, you can call
the get_result_batches() method in the Cursor object to retrieve a list of
ResultBatch objects and distribute these objects to different workers or nodes for processing.
Attributes¶
rowcount¶
Read-only attribute that returns the number of rows in the result batch.
compressed_size¶
Read-only attribute that returns the size of the data (when compressed) in the result batch.
uncompressed_size¶
Read-only attribute that returns the size of the data (uncompressed) in the result batch.
Methods¶
- to_arrow()¶
- Purpose:
This method returns a PyArrow table containing the rows in the ResultBatch object.
- Parameters:
None.
- Returns:
Returns a PyArrow table containing the rows from the ResultBatch object.

If there are no rows, this returns None.
- to_pandas()¶
- Purpose:
This method returns a pandas DataFrame containing the rows in the ResultBatch object.
- Parameters:
None.
- Returns:
Returns a pandas DataFrame containing the rows from the ResultBatch object.

If there are no rows, this returns an empty pandas DataFrame.
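For example, a minimal sketch of splitting a result set into batches and converting each batch locally; in practice, the batches would typically be distributed to separate workers, and the table name is a placeholder:

cur = con.cursor()
cur.execute("SELECT * FROM testtable")

batches = cur.get_result_batches()
if batches is not None:
    for batch in batches:
        df = batch.to_pandas()        # or batch.to_arrow()
        print(batch.rowcount, len(df))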
Object: ResultMetadata¶
A ResultMetadata object represents metadata about a column in the result set.
A list of these objects is returned by the description attribute and describe method of the Cursor
object.
This object was introduced in version 2.4.6 of the Snowflake Connector for Python.
Methods¶
None.
Attributes¶
- name¶
Name of the column
- display_size¶
Not used. Same as internal_size.
- internal_size¶
Internal data size.
- precision¶
Precision of numeric data.
- scale¶
Scale for numeric data.
- is_nullable¶
True if NULL values are allowed for the column, or False.
Module: snowflake.connector.constants¶
The snowflake.connector.constants module defines constants used in the API.
Enums¶
- class QueryStatus¶
Represents the status of an asynchronous query. This enum has the following constants:
- RUNNING: The query is still running.
- ABORTING: The query is in the process of being aborted on the server side.
- SUCCESS: The query finished successfully.
- FAILED_WITH_ERROR: The query finished unsuccessfully.
- QUEUED: The query is queued for execution (i.e. has not yet started running), typically because it is waiting for resources.
- DISCONNECTED: The session's connection is broken. The query's state will change to "FAILED_WITH_ERROR" soon.
- RESUMING_WAREHOUSE: The warehouse is starting up and the query is not yet running.
- BLOCKED: The statement is waiting on a lock held by another statement.
- NO_DATA: Data about the statement is not yet available, typically because the statement has not yet started executing.
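For example, a minimal sketch of comparing a query's status against these constants; query_id is assumed to come from an asynchronous query:

from snowflake.connector.constants import QueryStatus

status = con.get_query_status(query_id)
if status == QueryStatus.SUCCESS:
    print("Query finished successfully.")
elif status == QueryStatus.FAILED_WITH_ERROR:
    print("Query failed.")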
- class CertRevocationCheckMode¶
How to treat certificate revocation lists (CRLs) attached to a certificate. This enum has the following constants:
- DISABLED: No revocation check is done.
- ADVISORY: Only a revoked certificate can invalidate the chain. Errors related to the CRL don't revoke a certificate.
- ENABLED: Each certificate must have at least one valid CRL. Errors in connection, parsing, or validation of all associated CRLs revoke a certificate.
Module: snowflake.connector.pandas_tools¶
The snowflake.connector.pandas_tools module provides functions for
working with the pandas data analysis library.
For more information, see the pandas data analysis library documentation.
Functions¶
- write_pandas(parameters...)¶
- Purpose:
Writes a pandas DataFrame to a table in a Snowflake database.
To write the data to the table, the function saves the data to Parquet files, uses the PUT command to upload these files to a temporary stage, and uses the COPY INTO <table> command to copy the data from the files to the table. You can use some of the function parameters to control how the PUT and COPY INTO <table> statements are executed.
- Parameters:
The valid input parameters are:
- conn (required): Connection object that holds the connection to the Snowflake database.
- df (required): pandas.DataFrame object containing the data to be copied into the table.
- table_name (required): Name of the table where the data should be copied.
- database: Name of the database containing the table. By default, the function writes to the database that is currently in use in the session. Note: If you specify this parameter, you must also specify the schema parameter.
- schema: Name of the schema containing the table. By default, the function writes to the table in the schema that is currently in use in the session.
- bulk_upload_chunks: Setting this parameter to True changes the behavior of the write_pandas function to first write all the data chunks to the local disk and then perform the wildcard upload of the chunks folder to the stage. When set to False (default), the chunks are saved, uploaded, and deleted one by one.
- chunk_size: Number of elements to insert at a time. By default, the function inserts all elements at once in one chunk.
- compression: The compression algorithm to use for the Parquet files. You can specify either "gzip" for better compression or "snappy" for faster compression. By default, the function uses "gzip".
- on_error: Specifies how errors should be handled. Set this to one of the string values documented in the ON_ERROR copy option. By default, the function uses "ABORT_STATEMENT".
- parallel: Number of threads to use when uploading the Parquet files to the temporary stage. For the default number of threads used and guidelines on choosing the number of threads, see the parallel parameter of the PUT command.
- quote_identifiers: If False, prevents the connector from putting double quotes around identifiers before sending the identifiers to the server. By default, the connector puts double quotes around identifiers.
- Returns:
Returns a tuple of (success, num_chunks, num_rows, output) where:
- success is True if the function successfully wrote the data to the table.
- num_chunks is the number of chunks of data that the function copied.
- num_rows is the number of rows that the function inserted.
- output is the output of the COPY INTO <table> command.
- Example:
The following example writes the data from a pandas DataFrame to the table named ‘customers’.
import pandas
from snowflake.connector.pandas_tools import write_pandas

# Create the connection to the Snowflake database.
cnx = snowflake.connector.connect(...)

# Create a DataFrame containing data about customers
df = pandas.DataFrame([('Mark', 10), ('Luke', 20)], columns=['name', 'balance'])

# Write the data from the DataFrame to the table named "customers".
success, nchunks, nrows, _ = write_pandas(cnx, df, 'customers')
- pd_writer(parameters...)¶
- Purpose:
pd_writer is an insertion method for inserting data into a Snowflake database.

When calling pandas.DataFrame.to_sql, pass in method=pd_writer to specify that you want to use pd_writer as the method for inserting data. (You do not need to call pd_writer from your own code. The to_sql method calls pd_writer and supplies the input parameters needed.)

For more information, see:
- insertion method documentation.
- pandas documentation.
Note
Please note that when column names in the pandas DataFrame contain only lowercase letters, you must enclose the column names in double quotes; otherwise the connector raises a ProgrammingError.

The snowflake-sqlalchemy library does not quote lowercase column names when creating a table, while pd_writer quotes column names by default. The issue arises because the COPY INTO command expects column names to be quoted.

Future improvements will be made in the snowflake-sqlalchemy library.

For example:
import pandas as pd
from snowflake.connector.pandas_tools import pd_writer

sf_connector_version_df = pd.DataFrame([('snowflake-connector-python', '1.0')], columns=['NAME', 'NEWEST_VERSION'])

# Specify that the to_sql method should use the pd_writer function
# to write the data from the DataFrame to the table named "driver_versions"
# in the Snowflake database.
sf_connector_version_df.to_sql('driver_versions', engine, index=False, method=pd_writer)

# When the column names consist of only lower case letters, quote the column names
sf_connector_version_df = pd.DataFrame([('snowflake-connector-python', '1.0')], columns=['"name"', '"newest_version"'])
sf_connector_version_df.to_sql('driver_versions', engine, index=False, method=pd_writer)
The pd_writer function uses the write_pandas() function to write the data in the DataFrame to the Snowflake database.
- Parameters:
The valid input parameters are:
- table (required): pandas.io.sql.SQLTable object for the table.
- conn (required): sqlalchemy.engine.Engine or sqlalchemy.engine.Connection object used to connect to the Snowflake database.
- keys (required): Names of the table columns for the data to be inserted.
- data_iter (required): Iterator for the rows containing the data to be inserted.
- Example:
The following example passes method=pd_writer to the pandas.DataFrame.to_sql method, which in turn calls the pd_writer function to write the data in the pandas DataFrame to a Snowflake database.

import pandas
from snowflake.connector.pandas_tools import pd_writer

# Create a DataFrame containing data about customers
df = pandas.DataFrame([('Mark', 10), ('Luke', 20)], columns=['name', 'balance'])

# Specify that the to_sql method should use the pd_writer function
# to write the data from the DataFrame to the table named "customers"
# in the Snowflake database.
df.to_sql('customers', engine, index=False, method=pd_writer)
Date and timestamp support¶
Snowflake supports multiple DATE and TIMESTAMP data types, and the Snowflake Connector
allows binding native datetime and date objects for update and fetch operations.
Fetching data¶
When fetching date and time data, the Snowflake data types are converted into Python data types:
| Snowflake Data Types | Python Data Type | Behavior |
|---|---|---|
| TIMESTAMP_TZ | datetime with tzinfo | Fetches data, including the time zone offset, and translates it into a datetime with tzinfo object. |
| TIMESTAMP_LTZ, TIMESTAMP | datetime with tzinfo | Fetches data, translates it into a datetime object, and attaches tzinfo based on the TIMEZONE session parameter. |
| TIMESTAMP_NTZ | datetime | Fetches data and translates it into a datetime object. |
| DATE | date | Fetches data and translates it into a date object. |
Note
tzinfo is a UTC offset-based time zone object and not IANA time zone
names. The time zone names might not match, but equivalent offset-based
time zone objects are considered identical.
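For example, a minimal sketch of fetching a TIMESTAMP_LTZ column as a timezone-aware datetime; the table and column names are placeholders:

cur = con.cursor()
cur.execute("SELECT event_ts FROM testtable LIMIT 1")
ts = cur.fetchone()[0]     # a datetime object
print(ts, ts.tzinfo)       # tzinfo is a UTC offset-based time zone object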
Updating data¶
When updating date and time data, the Python data types are converted to Snowflake data types:
| Python Data Type | Snowflake Data Types | Behavior |
|---|---|---|
| datetime | TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE | Converts a datetime object into a string in the format of YYYY-MM-DD HH24:MI:SS.FF TZH:TZM and updates it. |
| struct_time | TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE | Converts a struct_time object into a string in the format of YYYY-MM-DD HH24:MI:SS.FF TZH:TZM and updates it. |
| date | TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE | Converts a date object into a string in the format of YYYY-MM-DD and updates it. |
| time | TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE | Converts a time object into a string in the format of HH24:MI:SS.FF and updates it. |
| timedelta | TIMESTAMP_TZ, TIMESTAMP_LTZ, TIMESTAMP_NTZ, DATE | Converts a timedelta object into a string in the format of HH24:MI:SS.FF and updates it. |