Azure private endpoints for internal stages

This topic provides concepts and detailed instructions for connecting to Snowflake internal stages through Microsoft Azure Private Endpoints.

Overview

Azure Private Endpoints and Azure Private Link can be combined to provide secure connectivity to Snowflake internal stages. This setup ensures that data loading and data unloading operations to Snowflake internal stages use the Azure internal network and do not take place over the public Internet.

Prior to Microsoft supporting Private Endpoints for internal stage access, it was necessary to create a proxy farm within the Azure VNet to facilitate secure access to Snowflake internal stages. With the added support of Private Endpoints for Snowflake internal stages, users and client applications can now access Snowflake internal stages over the private Azure network. The following diagram summarizes this new support:

[Diagram: Connect to internal stage using Azure Private Link (BEFORE and AFTER)]

Note the following regarding the numbers in the BEFORE diagram:

  • Users have two options to connect to a Snowflake internal stage:

    • Option A allows an on-premises connection directly to the internal stage as shown by the number 1.

    • Option B allows a connection to the internal stage through a proxy farm as shown by the numbers 2 and 3.

  • If using the proxy farm, users can also connect to Snowflake directly as denoted by the number 4.

Note the following regarding the numbers in the AFTER diagram:

  • For clarity, the diagram shows a single Private Endpoint from one Azure VNet pointing to a single Snowflake internal stage (6 and 7).

    Note that it is possible to configure multiple Private Endpoints, each within a different VNet, that point to the same Snowflake internal stage.

  • The updates in this feature remove the need to connect to Snowflake or a Snowflake internal stage through a proxy farm.

  • An on-premises user can connect to Snowflake directly as shown in number 5.

  • To connect to a Snowflake internal stage, an on-premises user connects to a Private Endpoint, number 6, and then uses Azure Private Link to connect to the Snowflake internal stage as shown in number 7.

In Azure, each Snowflake account has a dedicated storage account to use as an internal stage. The storage account URI differs depending on whether the connection to the storage account uses private connectivity (i.e. Azure Private Link). The private connectivity URI includes a privatelink segment.

Public storage account URI:

<storage_account_name>.blob.core.windows.net

Private connectivity storage account URI:

<storage_account_name>.privatelink.blob.core.windows.net
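
As a minimal sketch of how the two URIs relate, the following Python helper builds both hostnames from a storage account name. The function name and the sample account name are hypothetical and used only for illustration.

```python
def stage_hostnames(storage_account_name: str) -> dict:
    """Return the public and private connectivity hostnames for a storage account."""
    name = storage_account_name.strip()
    if not name:
        raise ValueError("storage account name must not be empty")
    return {
        "public": f"{name}.blob.core.windows.net",
        # The private connectivity hostname adds the privatelink segment.
        "private": f"{name}.privatelink.blob.core.windows.net",
    }


# "examplestageacct" is a made-up account name, not a real internal stage.
hosts = stage_hostnames("examplestageacct")
print(hosts["public"])
print(hosts["private"])
```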

Benefits

Implementing Private Endpoints to access Snowflake internal stages provides the following advantages:

  • Internal stage data does not traverse the public Internet.

  • Client and SaaS applications, such as Microsoft PowerBI, that run outside of the Azure VNet can connect to Snowflake securely.

  • Administrators are not required to modify firewall settings to access internal stage data.

  • Administrators can implement consistent security and monitoring regarding how users connect to storage accounts.

Limitations

Microsoft Azure defines how a Private Endpoint can interact with Snowflake:

  • A single Private Endpoint can communicate with a single Snowflake Service Endpoint. You can have multiple one-to-one configurations that connect to the same Snowflake internal stage.

  • The maximum number of private endpoints in your storage account that can connect to a Snowflake internal stage is fixed. For details, see Standard storage account limits.

Configuring private endpoints to access Snowflake internal stages

To configure Private Endpoints to access Snowflake internal stages, it is necessary to have support from the following three roles in your organization:

  1. The Snowflake account administrator (i.e. a user with the Snowflake ACCOUNTADMIN system role).

  2. The Microsoft Azure administrator.

  3. The network administrator.

Depending on the organization, it may be necessary to coordinate the configuration efforts with more than one person or team to implement the following configuration steps.

Complete the following steps to configure and implement secure access to Snowflake internal stages through Azure Private Endpoints:

  1. Verify that your Azure subscription is registered with the Azure Storage resource manager. This step allows you to connect to the internal stage from a private endpoint.

  2. As a Snowflake account administrator, execute the following statements in your Snowflake account and record the ResourceID of the internal stage storage account defined by the privatelink_internal_stage key. For more information, see ENABLE_INTERNAL_STAGES_PRIVATELINK and SYSTEM$GET_PRIVATELINK_CONFIG.

    use role accountadmin;
    alter account set ENABLE_INTERNAL_STAGES_PRIVATELINK = true;
    select key, value from table(flatten(input=>parse_json(system$get_privatelink_config())));
    
  3. As the Azure administrator, create a Private Endpoint through the Azure portal.

    View the Private Endpoint properties and record the resource ID value. This value will be the privateEndpointResourceID value in the next step.

    Verify that the Target sub-resource value is set to blob.

    For more information, see the Microsoft Azure Private Link documentation.

  4. As the Snowflake administrator, call the SYSTEM$AUTHORIZE_STAGE_PRIVATELINK_ACCESS function using the privateEndpointResourceID value as the function argument. This step authorizes access to the Snowflake internal stage through the Private Endpoint.

    use role accountadmin;
    select system$authorize_stage_privatelink_access('<privateEndpointResourceID>');
    

    If necessary, complete the steps in Revoking Private Endpoints to access Snowflake internal stages to revoke access to the internal stage.

  5. As the network administrator, update the DNS settings to resolve the URLs as follows:

    <storage_account_name>.blob.core.windows.net to <storage_account_name>.privatelink.blob.core.windows.net

    When using a private DNS zone in an Azure VNet, create the alias record for <storage_account_name>.privatelink.blob.core.windows.net.

    For more information, see Azure Private Endpoint DNS configuration.

    Tip

    • Use a separate Snowflake account for testing, and configure a private DNS zone in a test VNet to test the feature so that the testing is isolated and does not impact your other workloads.

    • If using a separate Snowflake account is not possible, use a test user to access Snowflake from a test VNet where the DNS changes are made.

    • To test from on-premises applications, use DNS forwarding to forward requests to the Azure private DNS in the VNet where the DNS settings are made. Execute the following command from the client machine to verify that the IP address returned is the private IP address for the storage account:

      dig <storage_account_name>.blob.core.windows.net
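
The DNS verification in the last tip can also be scripted. The following is a minimal Python sketch, assuming that an address in a private range indicates resolution through the Private Endpoint. The function names and sample addresses are hypothetical. Only the standard library ipaddress and socket modules are used.

```python
import ipaddress
import socket


def is_private_ip(address: str) -> bool:
    """Return True if the address belongs to a private (non-public) range."""
    return ipaddress.ip_address(address).is_private


def resolves_privately(hostname: str) -> bool:
    """Resolve hostname over IPv4 and check that every returned address is private."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET,
                               proto=socket.IPPROTO_TCP)
    addresses = {info[4][0] for info in infos}
    return all(is_private_ip(addr) for addr in addresses)


# Offline checks with illustrative addresses (not taken from a real deployment).
print(is_private_ip("10.0.0.5"))   # an address in the 10.0.0.0/8 private range
print(is_private_ip("20.60.1.1"))  # an address outside the private ranges
```

From the client machine, calling resolves_privately with the storage account hostname reported for your account performs the same check as inspecting the dig output by hand.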
      

Revoking Private Endpoints to access Snowflake internal stages

Complete the following steps to revoke access to Snowflake internal stages through Microsoft Azure Private Endpoints:

  1. As a Snowflake administrator, set the ENABLE_INTERNAL_STAGES_PRIVATELINK parameter to FALSE and call the SYSTEM$REVOKE_STAGE_PRIVATELINK_ACCESS function to revoke access to the Private Endpoint, using the same privateEndpointResourceID value that was used to originally authorize access to the Private Endpoint.

    use role accountadmin;
    alter account set enable_internal_stages_privatelink = false;
    select system$revoke_stage_privatelink_access('<privateEndpointResourceID>');
    
  2. As an Azure administrator, delete the Private Endpoint through the Azure portal.

  3. As a network administrator, remove the DNS and alias records that were used to resolve the storage account URLs.

Access through the Private Endpoint is now revoked, and the query result from calling the SYSTEM$GET_PRIVATELINK_CONFIG function should no longer return the privatelink_internal_stage key and its value.
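
As a rough sketch of that final check, the following Python snippet parses the JSON text returned by SYSTEM$GET_PRIVATELINK_CONFIG and reports whether the privatelink_internal_stage key is present. The function name, the sample JSON strings, and the extra key in them are hypothetical. Real output contains different keys and values.

```python
import json
from typing import Optional


def internal_stage_resource_id(config_json: str) -> Optional[str]:
    """Return the privatelink_internal_stage value, or None if the key is absent."""
    config = json.loads(config_json)
    return config.get("privatelink_internal_stage")


# Made-up sample outputs for illustration only.
before_revoke = json.dumps({
    "some_other_key": "example-value",
    "privatelink_internal_stage": "/subscriptions/0000/resourceGroups/example-rg"
                                  "/providers/Microsoft.Storage/storageAccounts/examplestageacct",
})
after_revoke = json.dumps({"some_other_key": "example-value"})

print(internal_stage_resource_id(before_revoke) is not None)
print(internal_stage_resource_id(after_revoke) is None)
```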

Troubleshooting

Azure applications that access Snowflake stages over the public Internet and also use a private DNS service to resolve service hostnames cannot access Snowflake stages if a private endpoint connection is established to the stage as described in this topic.

Once a private endpoint connection is created, Microsoft Azure automatically creates a CNAME record in the public DNS service that points the storage account host to its Azure Private Link counterpart (i.e. .privatelink.blob.core.windows.net). If any application has configured a private DNS zone for the same domain, then Microsoft Azure tries to resolve the storage account host by querying the private DNS service. If the entry for the storage account is not found in the private DNS service, a connection error occurs.

There are two options to address this issue:

  1. Remove or dissociate the private DNS zone from the application.

  2. Create a CNAME record for the storage account private hostname (i.e. <storage_account_name>.privatelink.blob.core.windows.net) in the private DNS service and point it to the hostname specified by the output of this command:

    dig CNAME <storage_account_name>.privatelink.blob.core.windows.net
    