Reading data from external data sources using the Snowpark Python DB-API

With the Snowpark Python DB-API, Snowpark Python users can programmatically pull data from external databases into Snowflake. The DB-API includes:

  • Python DB-API support: Connect to external databases using Python’s standard DB-API 2.0 drivers.

  • Streamlined setup: Use pip to install the necessary drivers, with no need to manage additional dependencies.

With this API, you can seamlessly pull data into Snowflake tables and transform it using Snowpark DataFrames for advanced analytics.
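For example, the end-to-end flow looks like the following sketch, where create_connection, src_table, and sf_table are illustrative placeholders and session is an existing Snowpark session:

# A minimal sketch of the end-to-end flow; all names are placeholders.

df = session.read.dbapi(
    create_connection,   # factory function that returns a DB-API 2.0 connection
    table="src_table",   # table in the source database
)

# Transform with Snowpark DataFrame operations as needed, then persist in Snowflake.
df.write.mode("overwrite").save_as_table("sf_table")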

Using the Snowpark Python DB-API

The DB-API can be used in much the same way as the Spark JDBC API: most parameters are identical or similar for parity. At the same time, Snowpark emphasizes a Python-first design with intuitive naming that avoids JDBC-specific configuration, which gives Python developers a familiar experience. The following table describes the DB-API parameters:

DB-API parameters

Parameter                Description
---------                -----------
create_connection        Function that creates a Python DB-API connection.
table                    The table in the source database to read from.
query                    SQL query wrapped as a subquery for reading data.
column                   Partitioning column for parallel reads.
lower_bound              Lower bound for partitioning.
upper_bound              Upper bound for partitioning.
num_partitions           Number of partitions for parallelism.
query_timeout            Timeout for SQL execution (in seconds).
fetch_size               Number of rows fetched per round trip.
custom_schema            Custom schema for the data pulled from the external database.
max_workers              Number of workers for fetching and uploading data in parallel.
predicates               List of conditions for WHERE-clause partitions.
session_init_statement   SQL or PL/SQL statement executed upon session initialization.
udtf_configs             Runs the workload in a Snowflake UDTF for better performance.
fetch_merge_count        Number of fetched batches merged into a single Parquet file before upload.
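For illustration, the following sketch combines several of these parameters in one call; the connection factory, query, and schema are placeholders rather than values from a real system:

# A sketch combining several DB-API parameters; all values are placeholders.
df = session.read.dbapi(
    create_connection,                        # DB-API connection factory
    query="SELECT ID, NAME FROM src_table",   # wrapped as a subquery by the API
    custom_schema="ID INTEGER, NAME STRING",  # schema applied to the pulled data
    query_timeout=60,                         # time out the source query after 60 seconds
    fetch_size=10000,                         # rows fetched per round trip
)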

Understanding parallelism

The Snowpark Python DB-API uses two independent forms of parallelism based on user input:

  • Partition-based parallelism

    When users specify predicates or partitioning information (for example, column, lower_bound, upper_bound, or num_partitions), Snowflake splits the query into multiple partitions. These are processed in parallel using multiprocessing, with each worker fetching and writing its own partition.

  • Fetch size–based parallelism within each partition

    Within a partition, the DB-API fetches rows in chunks defined by fetch_size. These rows are written to Snowflake in parallel as they are fetched, allowing reading and writing to overlap and maximizing throughput.

These two forms of parallelism operate independently. If neither partitioning nor fetch_size is specified, the function loads the entire source table into memory before writing to Snowflake. This can increase memory usage and reduce performance for large datasets.
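As an illustration, the following sketch (with a placeholder table, column, and bounds) splits a read into eight partitions and also sets fetch_size so that reading and writing overlap within each partition; the second call shows the predicates alternative, where each condition defines one partition:

# Partition-based parallelism: 8 partitions over the numeric column ID.
df = session.read.dbapi(
    create_connection,
    table="src_table",
    column="ID",          # partitioning column
    lower_bound=0,        # lower bound used to compute partition ranges
    upper_bound=1000000,  # upper bound used to compute partition ranges
    num_partitions=8,
    fetch_size=10000,     # fetch-size parallelism within each partition
)

# Alternatively, define the partitions explicitly with predicates.
df = session.read.dbapi(
    create_connection,
    table="src_table",
    predicates=[
        "REGION = 'NORTH'",
        "REGION = 'SOUTH'",
    ],
)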

SQL Server

To connect to SQL Server from Snowpark, you need the following three packages:

  • Microsoft ODBC Driver 18 for SQL Server (msodbcsql18)

  • snowflake-snowpark-python[pandas]

  • pyodbc

The following code examples show how to connect to SQL Server from a Snowpark client and a stored procedure.

Use the DB-API to connect to SQL Server from a Snowpark client

  1. Install the Microsoft ODBC driver for SQL Server. The following example uses Homebrew on macOS:

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
    brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
    brew update
    HOMEBREW_ACCEPT_EULA=Y brew install msodbcsql18 mssql-tools18
  2. Install snowflake-snowpark-python[pandas] and pyodbc:

    pip install snowflake-snowpark-python[pandas]
    pip install pyodbc
    
  3. Define the factory method for creating a connection to SQL Server:

    def create_sql_server_connection():
        import pyodbc
        HOST = "mssql_host"
        PORT = "mssql_port"
        USERNAME = "mssql_username"
        PASSWORD = "mssql_password"
        DATABASE = "mssql_db"
        connection_str = (
            f"DRIVER={{ODBC Driver 18 for SQL Server}};"
            f"SERVER={HOST},{PORT};"
            f"DATABASE={DATABASE};"
            f"UID={USERNAME};"
            f"PWD={PASSWORD};"
        )
        connection = pyodbc.connect(connection_str)
        return connection
    
    # Call dbapi to pull data from mssql_table

    df = session.read.dbapi(
        create_sql_server_connection,
        table="mssql_table")
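The client examples on this page assume an existing Snowpark session. A minimal sketch for creating one and persisting the result, with placeholder connection values, follows:

from snowflake.snowpark import Session

# Placeholder Snowflake connection parameters.
session = Session.builder.configs({
    "account": "my_account",
    "user": "my_user",
    "password": "my_password",
    "warehouse": "my_wh",
    "database": "my_db",
    "schema": "my_schema",
}).create()

df = session.read.dbapi(create_sql_server_connection, table="mssql_table")

# Persist the pulled data in a Snowflake table.
df.write.mode("overwrite").save_as_table("sf_table")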

Use the DB-API to connect to SQL Server from a stored procedure

  1. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  2. Configure the secret, a network rule to allow egress to the source endpoint, and EAI:

    CREATE OR REPLACE SECRET mssql_secret
      TYPE = PASSWORD
      USERNAME = 'mssql_username'
      PASSWORD = 'mssql_password';
    
    -- Configure a network rule to allow egress to the source endpoint
    
    CREATE OR REPLACE NETWORK RULE mssql_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('mssql_host:mssql_port');
    
    -- Configure an external access integration
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION mssql_access_integration
      ALLOWED_NETWORK_RULES = (mssql_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (mssql_secret)
      ENABLED = true;
    
    -- Create or replace a Python stored procedure
    
    CREATE OR REPLACE PROCEDURE sp_mssql_dbapi()
      RETURNS TABLE()
      LANGUAGE PYTHON
      RUNTIME_VERSION='3.11'
      HANDLER='run'
      PACKAGES=('snowflake-snowpark-python', 'pyodbc', 'msodbcsql')
      EXTERNAL_ACCESS_INTEGRATIONS = (mssql_access_integration)
      SECRETS = ('cred' = mssql_secret)
    
    AS $$
    
    # Get user name and password from mssql_secret
    
    import _snowflake
    username_password_object = _snowflake.get_username_password('cred')
    USER = username_password_object.username
    PASSWORD = username_password_object.password
    
    # Define the factory method for creating a connection to SQL Server
    from snowflake.snowpark import Session
    def create_sql_server_connection():
        import pyodbc
    
        host = "mssql_host"
        port = "mssql_port"
        username = USER
        password = PASSWORD
        database = "mssql_database"
        connection_str = (
          f"DRIVER={{ODBC Driver 18 for SQL Server}};"
          f"SERVER={host},{port};"
          f"DATABASE={database};"
          f"UID={username};"
          f"PWD={password};"
        )
    
        connection = pyodbc.connect(connection_str)
        return connection
    
    def run(session: Session):
        df = session.read.dbapi(
            create_sql_server_connection,
            table="mssql_table"
        )
        return df
    $$;
    
    CALL sp_mssql_dbapi();
    

Oracle

To connect to Oracle from Snowpark, you need the following two packages:

  • snowflake-snowpark-python[pandas]

  • oracledb

The following code examples show how to connect to Oracle from a Snowpark client, stored procedures, and a Snowflake notebook.

Use the DB-API to connect to Oracle from a Snowpark client

  1. Install snowflake-snowpark-python[pandas] and oracledb:

    pip install snowflake-snowpark-python[pandas]
    pip install oracledb
  2. Define the factory method for creating a connection to Oracle, and then call the DB-API to pull data:

    def create_oracle_db_connection():
        import oracledb
        HOST = "myhost"
        PORT = "myport"
        SERVICE_NAME = "myservice"
        USER = "myuser"
        PASSWORD = "mypassword"
        DSN = f"{HOST}:{PORT}/{SERVICE_NAME}"
        connection = oracledb.connect(
            user=USER,
            password=PASSWORD,
            dsn=DSN
        )
        return connection
    
    
    # Call dbapi to pull data from mytable
    
    df = session.read.dbapi(
        create_oracle_db_connection,
        table="mytable")

Use the DB-API to connect to Oracle from a stored procedure

  1. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  2. Configure the secret, a network rule to allow egress to the source endpoint, and EAI:

    -- Configure the secret
    
    CREATE OR REPLACE SECRET ora_secret
      TYPE = PASSWORD
      USERNAME = 'ora_username'
      PASSWORD = 'ora_password';
    
    -- Configure a network rule to allow egress to the source endpoint
    
    CREATE OR REPLACE NETWORK RULE ora_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('ora_host:ora_port');
    
    -- Configure an external access integration
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION ora_access_integration
      ALLOWED_NETWORK_RULES = (ora_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (ora_secret)
      ENABLED = true;
    

Using the DB-API to pull data from Oracle in a Python stored procedure

CREATE OR REPLACE PROCEDURE sp_ora_dbapi()
  RETURNS TABLE()
  LANGUAGE PYTHON
  RUNTIME_VERSION='3.11'
  HANDLER='run'
  PACKAGES=('snowflake-snowpark-python', 'oracledb')
  EXTERNAL_ACCESS_INTEGRATIONS = (ora_access_integration)
  SECRETS = ('cred' = ora_secret )
AS $$

# Get user name and password from ora_secret
import _snowflake
username_password_object = _snowflake.get_username_password('cred')
USER = username_password_object.username
PASSWORD = username_password_object.password

# Define the factory method for creating a connection to Oracle

from snowflake.snowpark import Session

def create_oracle_db_connection():
    import oracledb
    host = "ora_host"
    port = "ora_port"
    service_name = "ora_service"
    dsn = f"{host}:{port}/{service_name}"
    connection = oracledb.connect(
        user=USER,
        password=PASSWORD,
        dsn=dsn
    )
    return connection

def run(session: Session):
    df = session.read.dbapi(
        create_oracle_db_connection,
        table="ora_table"
    )
    return df
$$;

CALL sp_ora_dbapi();

Use the DB-API to connect to Oracle from a Snowflake notebook

  1. From Snowflake Notebook packages, select snowflake-snowpark-python and oracledb.

  2. Configure the secret, a network rule to allow egress to the source endpoint, and EAI:

    CREATE OR REPLACE SECRET ora_secret
      TYPE = PASSWORD
      USERNAME = 'ora_username'
      PASSWORD = 'ora_password';
    
    ALTER NOTEBOOK mynotebook SET SECRETS = ('snowflake-secret-object' = ora_secret);
    
    -- Configure a network rule to allow egress to the source endpoint
    
    CREATE OR REPLACE NETWORK RULE ora_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('ora_host:ora_port');
    
    -- Configure an external access integration.
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION ora_access_integration
      ALLOWED_NETWORK_RULES = (ora_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (ora_secret)
      ENABLED = true;
    
  3. Set up external access for Snowflake Notebooks, and then restart the notebook session.

Using the DB-API to pull data from Oracle in a Python cell of a Snowflake notebook

# Get user name and password from ora_secret

import _snowflake
username_password_object = _snowflake.get_username_password('snowflake-secret-object')
USER = username_password_object.username
PASSWORD = username_password_object.password

import snowflake.snowpark.context
session = snowflake.snowpark.context.get_active_session()

# Define the factory method for creating a connection to Oracle

def create_oracle_db_connection():
    import oracledb
    host = "ora_host"
    port = "ora_port"
    service_name = "ora_service"
    dsn = f"{host}:{port}/{service_name}"
    connection = oracledb.connect(
        user=USER,
        password=PASSWORD,
        dsn=dsn
    )
    return connection

# Use dbapi to read data from ora_table

df_ora = session.read.dbapi(
    create_oracle_db_connection,
    table='ora_table'
)

# Save data into sf_table

df_ora.write.mode("overwrite").save_as_table('sf_table')

PostgreSQL

To connect to PostgreSQL from Snowpark, you need the following two packages:

  • snowflake-snowpark-python[pandas]

  • psycopg2

The following code examples show how to connect to PostgreSQL from a Snowpark client, stored procedures, and a Snowflake notebook.

Use the DB-API to connect to PostgreSQL from a Snowpark client

  1. Install snowflake-snowpark-python[pandas] and psycopg2:

    pip install snowflake-snowpark-python[pandas]
    pip install psycopg2
  2. Define the factory method for creating a connection to PostgreSQL:

    def create_pg_connection():
        import psycopg2
        connection = psycopg2.connect(
            host="pg_host",
            port="pg_port",
            dbname="pg_dbname",
            user="pg_user",
            password="pg_password",
        )
        return connection

    # Call dbapi to pull data from pg_table

    df = session.read.dbapi(
        create_pg_connection,
        table="pg_table")

Use the DB-API to connect to PostgreSQL from a stored procedure

  1. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  2. Configure the secret, a network rule to allow egress to the source endpoint, and EAI:

    CREATE OR REPLACE SECRET pg_secret
      TYPE = PASSWORD
      USERNAME = 'pg_username'
      PASSWORD = 'pg_password';
    
    -- Configure a network rule.
    
    CREATE OR REPLACE NETWORK RULE pg_network_rule
     MODE = EGRESS
     TYPE = HOST_PORT
     VALUE_LIST = ('pg_host:pg_port');
    
    -- Configure an external access integration.
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION pg_access_integration
      ALLOWED_NETWORK_RULES = (pg_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (pg_secret)
      ENABLED = true;
    
  3. Use the DB-API to pull data from PostgreSQL in a Python stored procedure:

    CREATE OR REPLACE PROCEDURE sp_pg_dbapi()
    RETURNS TABLE()
    LANGUAGE PYTHON
    RUNTIME_VERSION='3.11'
    HANDLER='run'
    PACKAGES=('snowflake-snowpark-python', 'psycopg2')
    EXTERNAL_ACCESS_INTEGRATIONS = (pg_access_integration)
    SECRETS = ('cred' = pg_secret )
    AS $$
    
    # Get user name and password from pg_secret
    
    import _snowflake
    username_password_object = _snowflake.get_username_password('cred')
    USER = username_password_object.username
    PASSWORD = username_password_object.password
    
    # Define the factory method for creating a connection to PostgreSQL
    
    from snowflake.snowpark import Session
    
    def create_pg_connection():
        import psycopg2
        connection = psycopg2.connect(
            host="pg_host",
            port="pg_port",
            dbname="pg_dbname",
            user=USER,
            password=PASSWORD,
        )
        return connection
    
    def run(session: Session):
        df = session.read.dbapi(
            create_pg_connection,
            table="pg_table"
        )
        return df
    
    $$;
    CALL sp_pg_dbapi();
    

Use the DB-API to connect to PostgreSQL from a Snowflake notebook

  1. From Snowflake Notebook packages, select snowflake-snowpark-python and psycopg2.

  2. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  3. Configure the secret, a network rule to allow egress to the source endpoint, and EAI:

    -- Configure the secret
    
    CREATE OR REPLACE SECRET pg_secret
      TYPE = PASSWORD
      USERNAME = 'pg_username'
      PASSWORD = 'pg_password';
    
    ALTER NOTEBOOK pg_notebook SET SECRETS = ('snowflake-secret-object' = pg_secret);
    
    -- Configure the network rule to allow egress to the source endpoint
    
    CREATE OR REPLACE NETWORK RULE pg_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('pg_host:pg_port');
    
    -- Configure external access integration
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION pg_access_integration
      ALLOWED_NETWORK_RULES = (pg_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (pg_secret)
      ENABLED = true;
    
  4. Set up external access for Snowflake Notebooks:

    # Get the user name and password from pg_secret
    
    import _snowflake
    username_password_object = _snowflake.get_username_password('snowflake-secret-object')
    USER = username_password_object.username
    PASSWORD = username_password_object.password
    
    import snowflake.snowpark.context
    session = snowflake.snowpark.context.get_active_session()
    
    # Define the factory method for creating a connection to PostgreSQL
    
    def create_pg_connection():
        import psycopg2
        connection = psycopg2.connect(
            host="pg_host",
            port="pg_port",
            dbname="pg_dbname",
            user=USER,
            password=PASSWORD,
        )
        return connection
    
    # Use dbapi to read and save data from pg_table
    
    df = session.read.dbapi(
        create_pg_connection,
        table="pg_table"
    )
    
    # Save data into sf_table
    
    df.write.mode("overwrite").save_as_table('sf_table')
    
  5. Restart the notebook session.

MySQL

To connect to MySQL from Snowpark, you need the following two packages:

  • snowflake-snowpark-python[pandas]

  • pymysql

The following code examples show how to connect to MySQL from a Snowpark client, stored procedures, and a Snowflake notebook.

Use the DB-API to connect to MySQL from a Snowpark client

  1. Install snowflake-snowpark-python[pandas] and pymysql:

    pip install snowflake-snowpark-python[pandas]
    pip install pymysql
    
  2. Define the factory method for creating a connection to MySQL:

    def create_mysql_connection():
        import pymysql
        connection = pymysql.connect(
            host="mysql_host",
            port=3306,  # replace with your MySQL port
            database="mysql_db",
            user="mysql_user",
            password="mysql_password"
        )
        return connection

    # Call dbapi to pull data from mysql_table

    df = session.read.dbapi(
        create_mysql_connection,
        table="mysql_table"
    )

Use the DB-API to connect to MySQL from a stored procedure

  1. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  2. Configure the secret, a network rule to allow egress to the source endpoint, and EAI:

    CREATE OR REPLACE SECRET mysql_secret
      TYPE = PASSWORD
      USERNAME = 'mysql_username'
      PASSWORD = 'mysql_password';
    
    -- Configure a network rule.
    
    CREATE OR REPLACE NETWORK RULE mysql_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('mysql_host:mysql_port');
    
    -- Configure an external access integration
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION mysql_access_integration
      ALLOWED_NETWORK_RULES = (mysql_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (mysql_secret)
      ENABLED = true;
    
  3. Use the Snowpark Python DB-API to pull data from MySQL in a Python stored procedure:

    CREATE OR REPLACE PROCEDURE sp_mysql_dbapi()
    RETURNS TABLE()
    LANGUAGE PYTHON
    RUNTIME_VERSION='3.11'
    HANDLER='run'
    PACKAGES=('snowflake-snowpark-python', 'pymysql')
    EXTERNAL_ACCESS_INTEGRATIONS = (mysql_access_integration)
    SECRETS = ('cred' = mysql_secret )
    AS $$
    
    # Get user name and password from mysql_secret
    
    import _snowflake
    username_password_object = _snowflake.get_username_password('cred')
    USER = username_password_object.username
    PASSWORD = username_password_object.password

    # Define the factory method for creating a connection to MySQL

    from snowflake.snowpark import Session

    def create_mysql_connection():
        import pymysql
        connection = pymysql.connect(
            host="mysql_host",
            port=3306,  # replace with your MySQL port
            database="mysql_db",
            user=USER,
            password=PASSWORD,
        )
        return connection
    
    def run(session: Session):
        df = session.read.dbapi(
            create_mysql_connection,
            table="mysql_table"
        )
        return df
    $$;
    
    CALL sp_mysql_dbapi();
    

Use the DB-API to connect to MySQL from a Snowflake notebook

  1. From Snowflake Notebook packages, select snowflake-snowpark-python and pymysql.

  2. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  3. Configure the secret, and add it to the Snowflake notebook:

    CREATE OR REPLACE SECRET mysql_secret
      TYPE = PASSWORD
      USERNAME = 'mysql_username'
      PASSWORD = 'mysql_password';
    
    ALTER NOTEBOOK mynotebook SET SECRETS = ('snowflake-secret-object' = mysql_secret);
    
  4. Configure a network rule and an EAI:

    CREATE OR REPLACE NETWORK RULE mysql_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('mysql_host:mysql_port');
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION mysql_access_integration
      ALLOWED_NETWORK_RULES = (mysql_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (mysql_secret)
      ENABLED = true;
    
  5. Set up external access for Snowflake Notebooks, and then restart the notebook session:

    # Get user name and password from mysql_secret
    import _snowflake
    username_password_object = _snowflake.get_username_password('snowflake-secret-object')
    USER = username_password_object.username
    PASSWORD = username_password_object.password
    
    import snowflake.snowpark.context
    session = snowflake.snowpark.context.get_active_session()
    
    # Define the factory method for creating a connection to MySQL
    
    def create_mysql_connection():
        import pymysql
        connection = pymysql.connect(
            host="mysql_host",
            port=3306,  # replace with your MySQL port
            database="mysql_db",
            user=USER,
            password=PASSWORD,
        )
        return connection
    
    # Call dbapi to pull data from mysql_table
    
    df = session.read.dbapi(
        create_mysql_connection,
        table="mysql_table")
    
    # Save data into sf_table
    
    df.write.mode("overwrite").save_as_table('sf_table')
    

Databricks

To connect to Databricks from Snowpark, you need the following two packages:

  • snowflake-snowpark-python[pandas]

  • databricks-sql-connector

The following code examples show how to connect to Databricks from a Snowpark client, stored procedures, and a Snowflake Notebook.

Use the DB-API to connect to Databricks from a Snowpark client

  1. Install snowflake-snowpark-python[pandas] and databricks-sql-connector:

    pip install snowflake-snowpark-python[pandas]
    pip install databricks-sql-connector
    
  2. Define the factory method for creating a connection to Databricks:

    def create_dbx_connection():
        import databricks.sql
        HOST = "dbx_host"
        PATH = "dbx_path"
        ACCESS_TOKEN = "dbx_access_token"
        connection = databricks.sql.connect(
            server_hostname=HOST,
            http_path=PATH,
            access_token=ACCESS_TOKEN
        )
        return connection

    # Call dbapi to pull data from dbx_table

    df = session.read.dbapi(
        create_dbx_connection,
        table="dbx_table")

Use the DB-API to connect to Databricks from a stored procedure

  1. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  2. Configure the secret, a network rule to allow egress to the source endpoint, and EAI:

    CREATE OR REPLACE SECRET dbx_secret
      TYPE = GENERIC_STRING
      SECRET_STRING = 'dbx_access_token';
    
    -- Configure a network rule
    
    CREATE OR REPLACE NETWORK RULE dbx_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('dbx_host:dbx_port');
    
    -- Configure an external access integration
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION dbx_access_integration
      ALLOWED_NETWORK_RULES = (dbx_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (dbx_secret)
      ENABLED = true;
    
  3. Use the Snowpark Python DB-API to pull data from Databricks in a Python stored procedure:

    CREATE OR REPLACE PROCEDURE sp_dbx_dbapi()
      RETURNS TABLE()
      LANGUAGE PYTHON
      RUNTIME_VERSION='3.11'
      HANDLER='run'
      PACKAGES=('snowflake-snowpark-python', 'databricks-sql-connector')
      EXTERNAL_ACCESS_INTEGRATIONS = (dbx_access_integration)
      SECRETS = ('cred' = dbx_secret )
    AS $$
    
    # Get the access token from dbx_secret
    
    import _snowflake
    ACCESS_TOKEN = _snowflake.get_generic_secret_string('cred')
    
    from snowflake.snowpark import Session
    
    # Define the factory method for creating a connection to Databricks
    def create_dbx_connection():
        import databricks.sql
        connection = databricks.sql.connect(
            server_hostname="dbx_host",
            http_path="dbx_path",
            access_token=ACCESS_TOKEN,
        )
        return connection
    
    def run(session: Session):
        df = session.read.dbapi(
            create_dbx_connection,
            table="dbx_table"
        )
        return df
    
    $$;
    
    CALL sp_dbx_dbapi();
    

Use the DB-API to connect to Databricks from a Snowflake notebook

  1. From Snowflake Notebook packages, select snowflake-snowpark-python and databricks-sql-connector.

  2. Configure an external access integration (EAI), which is required to allow Snowflake to connect to the source endpoint.

    Note

    PrivateLink is recommended for secure data transfer, especially when you’re dealing with sensitive information. Ensure that your Snowflake account has the necessary PrivateLink privileges enabled and that the PrivateLink feature is configured and active in your Snowflake Notebook environment.

  3. Configure the secret, and add it to the Snowflake notebook:

    CREATE OR REPLACE SECRET dbx_secret
      TYPE = GENERIC_STRING
      SECRET_STRING = 'dbx_access_token';
    
    ALTER NOTEBOOK mynotebook SET SECRETS = ('snowflake-secret-object' = dbx_secret);
    
  4. Configure a network rule to allow egress to the source endpoint and an EAI:

    CREATE OR REPLACE NETWORK RULE dbx_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('dbx_host:dbx_port');
    
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION dbx_access_integration
      ALLOWED_NETWORK_RULES = (dbx_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (dbx_secret)
      ENABLED = true;
    
  5. Set up external access for Snowflake Notebooks, and then restart the notebook session:

    # Get the access token from dbx_secret
    
    import _snowflake
    ACCESS_TOKEN = _snowflake.get_generic_secret_string('cred')
    
    import snowflake.snowpark.context
    session = snowflake.snowpark.context.get_active_session()
    
    # Define the factory method for creating a connection to Databricks
    
    def create_dbx_connection():
        import databricks.sql
        connection = databricks.sql.connect(
            server_hostname="dbx_host",
            http_path="dbx_path",
            access_token=ACCESS_TOKEN,
        )
        return connection
    
    # Use dbapi to read data from dbx_table
    
    df = session.read.dbapi(
        create_dbx_connection,
        table="dbx_table"
    )
    
    # Save data into sf_table
    
    df.write.mode("overwrite").save_as_table('sf_table')
    

Limitations

The Snowpark Python DB-API supports only Python DB-API 2.0–compliant drivers (for example, pyodbc or oracledb). JDBC drivers are not supported in this release.