Integrate customer-hosted Python artifact repositories

Customer-hosted artifact repositories connect private Python artifact repository solutions directly to Snowflake. By integrating these external repositories, you can use the same package management workflows you already apply internally.

Note

Warehouse-based Snowflake Notebooks, Streamlit, Snowflake Native Apps, and SPCS services are currently not supported.

Because the repositories stay under your control, the package management and governance systems you already rely on also apply to Snowflake Python workloads. You configure a repository by using an API integration and a secret, and you can set it as the account-wide default to simplify deployment.

Customer-hosted artifact repositories support PrivateLink, so traffic between Snowflake and the repository can stay off the public Internet. This lets cloud-based data science workflows meet internal network security standards.

Key benefits of this integration include:

  • Flexibility: Snowflake Package Policy has been expanded to support all Artifact Repositories, including customer-hosted repository objects.

  • Security and Compliance: Use existing package governance and policies in customer-hosted repositories.

  • Consistency: Customers can manage Snowflake packages using the same repositories they use to manage other code bases.

Authentication methods

During the Private Preview, the supported authentication methods for customer-hosted artifact repositories are:

  • Username and password

  • Tokens

These credentials must be stored securely within a Snowflake SECRET object. OAuth and IAM-based authentication are not supported during the Private Preview.

Configure a customer-hosted artifact repository

To configure a customer-hosted artifact repository in Snowflake, you must create and link three primary Snowflake objects:

  • Snowflake SECRET: This object is used to securely store the repository credentials, such as a username and password or a token.

  • API integration: This object describes the network path to reach the repository, specifying whether the connection should go through the public Internet or via a PrivateLink endpoint for enhanced security.

  • Artifact repository object: This is the core object that ties together the API integration, the index URL of the repository, and the associated secret.

The following steps outline how to set this up:

  1. Create a Secret for credentials

    First, you must create a Snowflake SECRET to securely store the credentials (username/password or token) required to access your repository.

    -- Create a secret for credentials
    CREATE OR REPLACE SECRET my_repo_secret
      TYPE = PASSWORD
      USERNAME = 'your_username'
      PASSWORD = 'your_password_or_token';
    
  2. Create an API integration

    Create an API integration to describe the route to the repository. You have two options:

    • Public HTTPS: For repositories accessible over the Internet.

      CREATE OR REPLACE API INTEGRATION python_repo_integration
        API_PROVIDER = ARTIFACT_REPOSITORY_API
        API_ALLOWED_PREFIXES = ('https://nexus.example.com', 'https://artifactory.example.com')
        ALLOWED_AUTHENTICATION_SECRETS = (my_repo_secret)
        ENABLED = TRUE;
      

      Egress IP: To let Snowflake reach your package repository through its network firewall, add the egress IP address ranges that Snowflake publishes to the firewall's allowlist. To obtain and use the Snowflake egress IP addresses, follow these steps:

      Note

      Egress IP is available only for external access on AWS.

      1. Call SYSTEM$GET_SNOWFLAKE_EGRESS_IP_RANGES to get the current and upcoming IP ranges and their expiration times.

      2. Use the IP ranges you obtain to update firewall rules by using APIs, CLIs, or configuration management tools, as described in Automating IP address range refresh.
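
      The ranges returned in step 1 can be easier to feed into firewall tooling as rows. The following is a minimal sketch rather than a definitive recipe: it assumes the function returns a JSON array whose elements carry ipv4_prefix, effective, and expires fields, so confirm the field names against the actual output in your account before relying on them.

      ```sql
      -- Raw call: returns the current and upcoming egress ranges as a string.
      SELECT SYSTEM$GET_SNOWFLAKE_EGRESS_IP_RANGES();

      -- Sketch: expand the result into one row per range.
      -- Assumption: each array element has ipv4_prefix, effective, and expires fields.
      SELECT
        r.value:ipv4_prefix::STRING AS ipv4_prefix,
        r.value:effective::STRING   AS effective,
        r.value:expires::STRING     AS expires
      FROM TABLE(FLATTEN(INPUT => PARSE_JSON(SYSTEM$GET_SNOWFLAKE_EGRESS_IP_RANGES()))) AS r
      ORDER BY effective;
      ```

      Because the ranges carry expiration times, a scheduled job that reruns this query and updates the firewall rules is likely to be more reliable than a one-time manual update.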

    • PrivateLink: For internal repositories, use the parameter USE_PRIVATELINK_ENDPOINT = TRUE to ensure traffic stays within a VPC/VNet.

      Note

      PrivateLink requires Business Critical Edition (or higher).

      • Provision a private connectivity endpoint in the Snowflake VPC or VNet to enable Snowflake to connect to your repository service. For information about how to do this, see SYSTEM$PROVISION_PRIVATELINK_ENDPOINT.

      • Use the following code to create an API integration that uses private connectivity:

      CREATE OR REPLACE API INTEGRATION python_repo_integration_pl
        API_PROVIDER = ARTIFACT_REPOSITORY_API
        API_ALLOWED_PREFIXES = ('https://nexus-pl.internal.example.com')
        USE_PRIVATELINK_ENDPOINT = TRUE
        ALLOWED_AUTHENTICATION_SECRETS = (my_repo_secret)
        ENABLED = TRUE;
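
      The endpoint provisioning step above might look like the following sketch. The service name is a hypothetical placeholder (an AWS VPC endpoint service name is assumed here), and the host name simply reuses the example host from the integration above. Substitute the values for your own repository service, and see SYSTEM$PROVISION_PRIVATELINK_ENDPOINT for the arguments your cloud platform requires.

      ```sql
      -- Hypothetical values: replace with your endpoint service name and repository host.
      SELECT SYSTEM$PROVISION_PRIVATELINK_ENDPOINT(
        'com.amazonaws.vpce.us-west-2.vpce-svc-0123456789abcdef0',  -- provider service name (placeholder)
        'nexus-pl.internal.example.com'                             -- host name used in API_ALLOWED_PREFIXES
      );

      -- Check the provisioning status of the private endpoints in this account.
      SELECT SYSTEM$GET_PRIVATELINK_ENDPOINTS_INFO();
      ```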
      
  3. Create the Artifact Repository object

    This object ties the previous components together with your repository’s index URL.

    -- Create the artifact repository object
    CREATE OR REPLACE ARTIFACT REPOSITORY my_python_repo
      TYPE = PYPI
      API_INTEGRATION = python_repo_integration
      INDEX_URL = 'https://nexus.example.com/repository/pypi-proxy/simple/'
      AUTHENTICATION_SECRET = my_repo_secret
      COMMENT = 'Customer-hosted Python package repository (Nexus)';
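
    If you created the PrivateLink integration in step 2, the artifact repository would reference that integration instead. The following sketch reuses the syntax shown above; the repository name my_python_repo_pl and the index URL path are illustrative assumptions, not values from your environment.

    ```sql
    -- Sketch: artifact repository that routes through the PrivateLink integration.
    -- my_python_repo_pl and the URL path are hypothetical.
    CREATE OR REPLACE ARTIFACT REPOSITORY my_python_repo_pl
      TYPE = PYPI
      API_INTEGRATION = python_repo_integration_pl
      INDEX_URL = 'https://nexus-pl.internal.example.com/repository/pypi-proxy/simple/'
      AUTHENTICATION_SECRET = my_repo_secret
      COMMENT = 'Customer-hosted Python package repository (Nexus, PrivateLink)';

    -- Review the integration's settings, such as its allowed prefixes.
    DESCRIBE API INTEGRATION python_repo_integration_pl;
    ```

    In this sketch the index URL starts with a prefix listed in the integration's API_ALLOWED_PREFIXES, mirroring the public HTTPS example.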
    

The following example creates a Python UDF that installs a package from the customer-hosted repository:

CREATE OR REPLACE FUNCTION test_udf()
  RETURNS STRING
  LANGUAGE PYTHON
  RUNTIME_VERSION = '3.10'
  ARTIFACT_REPOSITORY = my_python_repo
  ARTIFACT_REPOSITORY_PACKAGES = ('test_whl_package')
  HANDLER = 'test'
AS $$
import test_whl_package

def test():
  return test_whl_package.say_hello()
$$;

SELECT test_udf();

You can also use customer-hosted repositories in Python stored procedures. For stored procedures to work, your repository must host the Snowpark package (snowflake-snowpark-python).

CREATE OR REPLACE PROCEDURE test_sproc()
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.11'
ARTIFACT_REPOSITORY = my_python_repo
ARTIFACT_REPOSITORY_PACKAGES = ('snowflake-snowpark-python', 'test_whl_package')
HANDLER = 'run'
AS $$
import test_whl_package

def run(session):
  return test_whl_package.say_hello()
$$;

CALL test_sproc();