Configure an external volume for Iceberg tables

This topic provides information to help you configure an external volume for Iceberg tables. Before you can create an Iceberg table, you must have an external volume.

You can create an external volume for the following cloud storage services:

  • Amazon S3

  • Google Cloud Storage

  • Azure storage

Configure an external volume for Amazon S3

This section covers how to grant Snowflake restricted access to your own Amazon S3 bucket using an external volume.

An administrator in your organization grants the IAM user permissions in your Amazon Web Services (AWS) account.

Note

  • Snowflake does not support external volumes with bucket names that contain dots (for example, my.s3.bucket). Snowflake uses virtual-hosted-style paths and HTTPS to access data in S3, but S3 does not support SSL for virtual-hosted-style buckets with dots in the name.

  • To complete the instructions in this section, you must have permissions in AWS to create and manage IAM policies and roles. If you are not an AWS administrator, ask your AWS administrator to perform these tasks.

  • To support data recovery, enable versioning for your external cloud storage location.
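The bucket-name restriction in the first note can be checked before you start the configuration. The following is a minimal sketch in Python; the function name bucket_name_has_dots is hypothetical and is not part of any Snowflake or AWS tooling.

```python
def bucket_name_has_dots(bucket_name: str) -> bool:
    """Return True if the S3 bucket name contains a dot.

    Buckets with dots in the name cannot be used for external volumes.
    """
    return "." in bucket_name


# Example names only; replace with your own bucket names.
for name in ("my.s3.bucket", "my-s3-bucket"):
    status = "not usable" if bucket_name_has_dots(name) else "ok"
    print(f"{name}: {status}")
```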

Step 1: Configure access permissions for the S3 bucket

AWS access control requirements

To access files in the folder and sub-folders, Snowflake requires the following permissions on an S3 bucket and folder:

  • s3:DeleteObject

  • s3:DeleteObjectVersion

  • s3:GetBucketLocation

  • s3:GetObject

  • s3:GetObjectVersion

  • s3:ListBucket

  • s3:PutObject

Note

The s3:PutObject permission grants write access to the external volume location. To completely configure write access, the ALLOW_WRITES parameter of the external volume must be set to TRUE (the default value).

As a best practice, Snowflake recommends that you create a designated IAM policy that grants Snowflake access to the S3 bucket. You can then attach the policy to a role and use the security credentials generated by AWS for that role to access files in the bucket.

Create an IAM policy

To configure access permissions for Snowflake in the AWS Management Console, do the following:

  1. Log in to the AWS Management Console.

  2. From the home dashboard, select Identity & Access Management (IAM).

  3. From the left-hand navigation pane, select Account settings.

  4. In the Security Token Service Regions list, find the Snowflake region where your account is located. If the status is Inactive, select Activate.

  5. From the left-hand navigation pane, select Policies.

  6. Select Create Policy.

  7. Select the JSON tab.

  8. Add a policy that grants Snowflake access to the S3 bucket and folder.

    AWS policies support a variety of different security use cases. The following policy (in JSON format) provides Snowflake with the required permissions to read and write data using a single bucket and folder path.

    Copy and paste the text into the policy editor:

    Note

    • Replace <bucket> and <prefix> with your actual bucket name and folder path prefix.

    • The Amazon Resource Names (ARNs) for buckets in government regions have an arn:aws-us-gov:s3::: prefix.

    • Setting the "s3:prefix" condition to ["*"] grants access to all prefixes in the specified bucket. Setting it to ["<prefix>/*"] grants access only to the specified path in the bucket.

    {
       "Version": "2012-10-17",
       "Statement": [
             {
                "Effect": "Allow",
                "Action": [
                   "s3:PutObject",
                   "s3:GetObject",
                   "s3:GetObjectVersion",
                   "s3:DeleteObject",
                   "s3:DeleteObjectVersion"
                ],
                "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
             },
             {
                "Effect": "Allow",
                "Action": [
                   "s3:ListBucket",
                   "s3:GetBucketLocation"
                ],
                "Resource": "arn:aws:s3:::<bucket>",
                "Condition": {
                   "StringLike": {
                         "s3:prefix": [
                            "<prefix>/*"
                         ]
                   }
                }
             }
       ]
    }
    
  9. Select Review policy.

  10. Enter a policy Name (for example, snowflake_access) and an optional Description.

  11. Select Create policy.
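If you manage several buckets or prefixes, you might prefer to generate the policy document from step 8 instead of editing it by hand. The following Python sketch reproduces that JSON structure for a given bucket and prefix. The function name build_s3_policy and its gov_region flag are hypothetical helpers for illustration, and the sketch assumes a non-empty folder prefix.

```python
import json


def build_s3_policy(bucket: str, prefix: str, gov_region: bool = False) -> dict:
    """Build the IAM policy document from step 8 for one bucket and prefix."""
    prefix = prefix.strip("/")
    if not prefix:
        raise ValueError("this sketch assumes a non-empty folder prefix")
    # Buckets in government regions use the arn:aws-us-gov partition.
    partition = "aws-us-gov" if gov_region else "aws"
    bucket_arn = f"arn:{partition}:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level permissions on everything under the prefix.
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:GetObjectVersion",
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion",
                ],
                "Resource": f"{bucket_arn}/{prefix}/*",
            },
            {
                # Bucket-level permissions, limited to the prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arn,
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


# Placeholder bucket and prefix; replace with your own values.
print(json.dumps(build_s3_policy("my-example-bucket", "iceberg"), indent=2))
```

You can paste the printed JSON into the policy editor on the JSON tab.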

Step 2: Create an IAM role in AWS

In the AWS Management Console, create an AWS IAM role to grant privileges on the S3 bucket containing your data files.

  1. Log in to the AWS Management Console.

  2. From the home dashboard, select Identity & Access Management (IAM).

  3. From the left-hand navigation pane, select Roles.

  4. Select Create role.

  5. For the trusted entity type, select Another AWS account.

  6. In the Account ID field, enter your own AWS account ID. In a later step, you modify the trusted relationship and grant access to Snowflake.

  7. Select the Require external ID option. An external ID is used to grant access to your AWS resources (such as S3 buckets) to a third party like Snowflake.

    Enter a placeholder ID such as 0000. In a later step, you modify the trust relationship for your IAM role and specify the external ID for your external volume.

  8. Select Next.

  9. Locate the policy you created in Step 1: Configure access permissions for the S3 bucket, and select this policy.

  10. Select Next.

  11. Enter a name and description for the role, and select Create role. You have now created an IAM policy for a bucket, created an IAM role, and attached the policy to the role.

  12. On the role summary page, locate and record the Role ARN value. You use this value in the next step to create a Snowflake external volume that references this role.


Step 3: Grant privileges required for SSE-KMS encryption to the IAM role (optional)

To upload an object encrypted with an AWS KMS key to Amazon S3, the IAM role that you created in Step 2: Create an IAM role in AWS needs kms:GenerateDataKey permissions on the key. To download an object encrypted with an AWS KMS key, the IAM role needs kms:Decrypt permissions on the key.

If you want to use a KMS key for your server-side encryption, follow these steps to create a key and reference it.

  1. In the AWS Management Console, go to the KMS service. From the left navigation, select Customer managed keys, and then select Create key. You must create a key in the same region as your bucket.

  2. Create a symmetric key type. For the key usage, select Encrypt and decrypt. Select Next.

  3. In the Alias box, create a name for the key and select Next.

  4. If needed, provide an administrator for the key and select Next.

  5. In the Define key usage permissions step, enter the name of your IAM role. Select the checkbox next to the role, then select Next.

  6. Select Finish to create the key.

  7. Find the key in the list of customer managed keys, select it, and record its ARN. The following is an example of an ARN for a key: arn:aws:kms:us-west-2:111111122222:key/1a1a11aa-aa1a-aaa1a-a1a1-000000000000.

    When you create your external volume, set the KMS_KEY_ID value to the ARN of your key.
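The two KMS permissions named at the start of this step can also be written as an IAM policy statement. The following Python sketch builds such a statement for a key ARN. The function name build_kms_statement and its allow_writes flag are hypothetical, and whether an identity-based statement like this is sufficient depends on how your key policy is configured, so treat it as an illustration of the required permissions rather than a replacement for the console steps.

```python
import json


def build_kms_statement(key_arn: str, allow_writes: bool = True) -> dict:
    """Build an IAM statement with the KMS permissions the role needs."""
    # kms:Decrypt is needed to download objects encrypted with the key.
    actions = ["kms:Decrypt"]
    if allow_writes:
        # kms:GenerateDataKey is needed to upload encrypted objects.
        actions.append("kms:GenerateDataKey")
    return {"Effect": "Allow", "Action": actions, "Resource": key_arn}


# Example key ARN from this topic; replace with the ARN you recorded.
key_arn = "arn:aws:kms:us-west-2:111111122222:key/1a1a11aa-aa1a-aaa1a-a1a1-000000000000"
print(json.dumps(build_kms_statement(key_arn), indent=2))
```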

Step 4: Create an external volume in Snowflake

Create an external volume using the CREATE EXTERNAL VOLUME command.

Note

Only account administrators (users with the ACCOUNTADMIN role) can execute this SQL command.

The following example creates an external volume that defines an Amazon S3 storage location with encryption:

CREATE OR REPLACE EXTERNAL VOLUME exvol
   STORAGE_LOCATIONS =
      (
         (
            NAME = 'my-s3-us-west-2'
            STORAGE_PROVIDER = 'S3'
            STORAGE_BASE_URL = 's3://MY_EXAMPLE_BUCKET/'
            STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/myrole'
            ENCRYPTION=(TYPE='AWS_SSE_KMS' KMS_KEY_ID='arn:aws:kms:us-west-2:111111122222:key/1a1a11aa-aa1a-aaa1a-a1a1-000000000000')
         )
      );

Note

Optionally, use the STORAGE_AWS_EXTERNAL_ID parameter to specify your own external ID. You might choose this option to use the same external ID across multiple external volumes and/or storage integrations.

Step 5: Retrieve the AWS IAM user for your Snowflake account

  1. To retrieve the ARN for the AWS IAM user that was created automatically for your Snowflake account, use the DESCRIBE EXTERNAL VOLUME command. Specify the name of the external volume that you created previously.

    Example:

    DESC EXTERNAL VOLUME exvol;
    
  2. Record the values for the following properties:

    • STORAGE_AWS_IAM_USER_ARN: The AWS IAM user created for your Snowflake account; for example, arn:aws:iam::123456789001:user/abc1-b-self1234. Snowflake provisions a single IAM user for your entire Snowflake account. All S3 external volumes in your account use that IAM user.

    • STORAGE_AWS_EXTERNAL_ID: The external ID that Snowflake uses to establish a trust relationship with AWS. If you didn’t specify an external ID (STORAGE_AWS_EXTERNAL_ID) when you created the external volume, Snowflake generates an ID for you to use.

    You provide these values in the next step.

Step 6: Grant the IAM user permissions to access bucket objects

In this step, you configure permissions that allow the IAM user for your Snowflake account to access objects in your S3 bucket.

  1. Log in to the AWS Management Console.

  2. Select Identity & Access Management (IAM).

  3. From the left-hand navigation pane, select Roles.

  4. Select the role you created in Step 2: Create an IAM role in AWS.

  5. Select the Trust relationships tab.

  6. Select Edit trust relationship.

  7. Modify the policy document with the output values that you recorded in Step 5: Retrieve the AWS IAM user for your Snowflake account:

    Policy document for IAM role

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "<snowflake_user_arn>"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<snowflake_external_id>"
            }
          }
        }
      ]
    }
    

    Where:

    • snowflake_user_arn is the STORAGE_AWS_IAM_USER_ARN value you recorded.

    • snowflake_external_id is the STORAGE_AWS_EXTERNAL_ID value you recorded.

    Note

    You must update this policy document if you create a new external volume (or recreate an existing external volume using the CREATE OR REPLACE EXTERNAL VOLUME syntax) and do not provide your own external ID. For security reasons, a new or recreated external volume has a different external ID and cannot resolve the trust relationship unless you update this trust policy.

  8. Select Update Trust Policy. The changes are saved.
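Because the trust policy must be updated whenever the external ID changes, it can help to generate the document in step 7 from the two recorded values. The following is a minimal Python sketch; the function name build_trust_policy is hypothetical, and the external ID shown is a placeholder that you replace with the value you recorded.

```python
import json


def build_trust_policy(snowflake_user_arn: str, snowflake_external_id: str) -> dict:
    """Build the trust policy document from step 7 for the IAM role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                # STORAGE_AWS_IAM_USER_ARN from DESC EXTERNAL VOLUME.
                "Principal": {"AWS": snowflake_user_arn},
                "Action": "sts:AssumeRole",
                "Condition": {
                    # STORAGE_AWS_EXTERNAL_ID from DESC EXTERNAL VOLUME.
                    "StringEquals": {"sts:ExternalId": snowflake_external_id}
                },
            }
        ],
    }


# The ARN is the example value from Step 5; the external ID is a placeholder.
policy = build_trust_policy(
    "arn:aws:iam::123456789001:user/abc1-b-self1234",
    "<snowflake_external_id>",
)
print(json.dumps(policy, indent=2))
```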

Note

To verify that your permissions are configured correctly, create an Iceberg table using this external volume. Snowflake does not verify that your permissions are set correctly until the first Iceberg table that references this external volume is created.

Next steps

After you configure an external volume, you can create an Iceberg table.

Configure an external volume for Google Cloud Storage

This section covers how to grant Snowflake restricted access to a Google Cloud Storage (GCS) bucket using an external volume.

An administrator in your organization grants the IAM user permissions in your Google Cloud account.

Note

  • To complete the instructions in this section, you must have permissions in Google Cloud to create and manage IAM policies and roles. If you are not a Google Cloud administrator, ask your Google Cloud administrator to perform these tasks.

  • To support data recovery, enable versioning for your external cloud storage location.

Step 1: Create an external volume in Snowflake

Create an external volume using the CREATE EXTERNAL VOLUME command.

Note

Only account administrators (users with the ACCOUNTADMIN role) can execute this SQL command.

The following example creates an external volume that defines a GCS storage location with encryption:

CREATE EXTERNAL VOLUME exvol
  STORAGE_LOCATIONS =
    (
      (
        NAME = 'my-us-east-1'
        STORAGE_PROVIDER = 'GCS'
        STORAGE_BASE_URL = 'gcs://mybucket1/path1/'
        ENCRYPTION=(TYPE='GCS_SSE_KMS' KMS_KEY_ID = '1234abcd-12ab-34cd-56ef-1234567890ab')
      )
    );

Step 2: Retrieve the Cloud Storage service account for your Snowflake account

To retrieve the ID for the Cloud Storage service account that was created automatically for your Snowflake account, use the DESCRIBE EXTERNAL VOLUME command. Specify the name of the external volume that you created previously.

Example:

DESC EXTERNAL VOLUME exvol;

Record the value of the STORAGE_GCP_SERVICE_ACCOUNT property in the output (for example, service-account-id@project1-123456.iam.gserviceaccount.com).

Snowflake provisions a single Cloud Storage service account for your entire Snowflake account. All Google Cloud Storage external volumes use that service account.

Step 3: Grant the service account permissions to access bucket objects

In this step, you configure IAM access permissions for Snowflake in your Google Cloud Platform Console.

Create a custom IAM role

Create a custom role that has the permissions required to access the bucket and get objects.

  1. Log in to the Google Cloud Platform Console as a project editor.

  2. From the home dashboard, select IAM & admin » Roles.

  3. Select Create Role.

  4. Enter a name and description for the custom role.

  5. Select Add Permissions.

  6. Filter the list of permissions, and add the following from the list:

    • storage.buckets.get

    • storage.objects.create

    • storage.objects.delete

    • storage.objects.get

    • storage.objects.list

  7. Select Create.
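To double-check a custom role against the permission list in step 6, you can compare the permissions the role holds with the required set. The following is a minimal Python sketch; the function name missing_permissions is hypothetical, and it assumes you have already obtained the role's permission names as a list of strings by some other means.

```python
# Permissions from step 6 that the custom role must include.
REQUIRED_PERMISSIONS = frozenset(
    {
        "storage.buckets.get",
        "storage.objects.create",
        "storage.objects.delete",
        "storage.objects.get",
        "storage.objects.list",
    }
)


def missing_permissions(granted) -> list:
    """Return the required permissions that are absent from `granted`."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Example: a read-only role lacks the create and delete permissions.
print(missing_permissions(
    ["storage.buckets.get", "storage.objects.get", "storage.objects.list"]
))
```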

Assign the custom role to the Cloud Storage service account

  1. Log in to the Google Cloud Platform Console as a project editor.

  2. From the home dashboard, select Storage » Browser.

  3. Select a bucket to configure for access.

  4. Select SHOW INFO PANEL in the upper-right corner. The information panel for the bucket appears.

  5. In the Add members field, search for the service account name from the output in Step 2: Retrieve the Cloud Storage service account for your Snowflake account.

  6. From the Select a role dropdown, select Storage » Custom » <role>. The <role> is the custom Cloud Storage role that you created in Create a custom IAM role.

  7. Select Add. The service account name is added to the Storage Object Viewer role dropdown in the information panel.


Note

To verify that your permissions are configured correctly, create an Iceberg table using this external volume. Snowflake does not verify that your permissions are set correctly until the first Iceberg table that references this external volume is created.

Grant the Cloud Storage service account permissions on the cloud key management service cryptographic keys

Note

This step is required only if your GCS bucket is encrypted using a key stored in the Google Cloud Key Management Service (Cloud KMS).

  1. Log in to the Google Cloud Platform Console as a project editor.

  2. From the home dashboard, select Security » Cryptographic keys.

  3. Select the key ring that is assigned to your GCS bucket.

  4. In the upper-right corner, select SHOW INFO PANEL. The information panel for the key ring appears.

  5. In the Add members field, search for the service account name from the DESCRIBE EXTERNAL VOLUME output in Step 2: Retrieve the Cloud Storage service account for your Snowflake account.

  6. From the Select a role dropdown, select the Cloud KMS CryptoKey Encrypter/Decrypter role.

  7. Select Add. The service account name is added to the Cloud KMS CryptoKey Encrypter/Decrypter role dropdown in the information panel.

Next steps

After you configure an external volume, you can create an Iceberg table.

Configure an external volume for Azure storage

This section covers how to grant Snowflake restricted access to your own Microsoft Azure (Azure) container using an external volume. Snowflake supports the following Azure cloud storage services for external volumes:

  • Blob storage

  • Data Lake Storage Gen2

  • General-purpose v1

  • General-purpose v2

An administrator in your organization grants the IAM user permissions in your Azure account.

Note

  • Completing the instructions in this section requires permissions in Azure to create and manage IAM policies and roles. If you are not an Azure administrator, ask your Azure administrator to perform these tasks.

  • To support data recovery, enable versioning for your external cloud storage location.

Step 1: Create an external volume in Snowflake

Create an external volume using the CREATE EXTERNAL VOLUME command.

Note

Only account administrators (users with the ACCOUNTADMIN role) can execute this SQL command.

The following example creates an external volume that defines an Azure storage location:

CREATE EXTERNAL VOLUME exvol
  STORAGE_LOCATIONS =
    (
      (
        NAME = 'my-azure-northeurope'
        STORAGE_PROVIDER = 'AZURE'
        STORAGE_BASE_URL = 'azure://exampleacct.blob.core.windows.net/my_container_northeurope/'
        AZURE_TENANT_ID = 'a123b4c5-1234-123a-a12b-1a23b45678c9'
      )
    );

Note

Use the azure:// prefix and not https:// when specifying a value for STORAGE_BASE_URL.
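If you copy the container URL from the Azure portal, it typically starts with https://. The following Python sketch rewrites such a URL to the azure:// form expected by STORAGE_BASE_URL. The function name to_azure_base_url is hypothetical; it changes only the scheme and leaves the rest of the URL as-is.

```python
def to_azure_base_url(url: str) -> str:
    """Return the URL with an azure:// scheme for STORAGE_BASE_URL."""
    if url.startswith("https://"):
        # Swap the scheme; keep the account, container, and path unchanged.
        url = "azure://" + url[len("https://"):]
    if not url.startswith("azure://"):
        raise ValueError(f"expected an https:// or azure:// URL, got: {url}")
    return url


# Example account and container names from this topic.
print(to_azure_base_url(
    "https://exampleacct.blob.core.windows.net/my_container_northeurope/"
))
```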

Step 2: Grant Snowflake access to the storage location

  1. To retrieve a URL to the Microsoft permissions request page, use the DESCRIBE EXTERNAL VOLUME command. Specify the name of the external volume that you created previously.

    DESC EXTERNAL VOLUME exvol;
    

    Record the values for the following properties:

    • AZURE_CONSENT_URL: URL to the Microsoft permissions request page.

    • AZURE_MULTI_TENANT_APP_NAME: Name of the Snowflake client application created for your account. In a later step in this section, you grant this application permission to obtain an access token on your allowed storage location.

    You use these values in the following steps.

  2. In a web browser, navigate to the Microsoft permissions request page (the AZURE_CONSENT_URL).

  3. Select Accept. This action allows the Azure service principal created for your Snowflake account to obtain an access token on any resource inside your tenant. Obtaining an access token succeeds only if you grant the service principal the appropriate permissions on the container (see the next step).

    The Microsoft permissions request page redirects to the Snowflake corporate site (snowflake.com).

  4. Log in to the Microsoft Azure portal.

  5. Go to Azure Services » Storage Accounts. Select the name of the storage account you want to grant the Snowflake service principal access to.

    Note

    You must set IAM permissions for an external volume at the storage account level, not the container level.

  6. Select Access Control (IAM) » Add role assignment.

  7. Select the Storage Blob Data Contributor role to grant read and write access to the Snowflake service principal.

    Note

    The Storage Blob Data Contributor role grants write access to the external volume location. To completely configure write access, the ALLOW_WRITES parameter of the external volume must be set to TRUE (the default value).

  8. Search for the Snowflake service principal. This is the identity in the AZURE_MULTI_TENANT_APP_NAME property that you recorded from the DESC EXTERNAL VOLUME output in step 1 of this procedure. Search for the string before the underscore in the AZURE_MULTI_TENANT_APP_NAME value.

    Important

    • It can take an hour or longer for Azure to create the Snowflake service principal requested through the Microsoft request page in this section. If the service principal is not available immediately, wait an hour or two and then search again.

    • If you delete the service principal, the external volume stops working.

  9. Select Review + assign.

    Note

    It can take up to 10 minutes for changes to take effect when you assign a role. For more information, see Symptom - Role assignment changes are not being detected in the Microsoft Azure documentation.
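The search term for step 8 is the part of the AZURE_MULTI_TENANT_APP_NAME value that precedes the underscore. The following Python sketch extracts it. The function name service_principal_search_term and the sample application name are hypothetical; use the value from your own DESC EXTERNAL VOLUME output.

```python
def service_principal_search_term(app_name: str) -> str:
    """Return the part of AZURE_MULTI_TENANT_APP_NAME before the underscore."""
    # Split on the first underscore only and keep the leading part.
    return app_name.split("_", 1)[0]


# Hypothetical application name, for illustration only.
print(service_principal_search_term("exampleappname_1234567890"))
```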

Note

To verify that your permissions are configured correctly, create an Iceberg table using this external volume. Snowflake does not verify that your permissions are set correctly until the first Iceberg table that references this external volume is created.

Next steps

After you configure an external volume, you can create an Iceberg table.

Enable versioning for your external cloud storage

Iceberg table data is stored in external cloud storage that you manage. If the data is in a central data repository (or data lake) that is operated on by multiple tools and services, accidental deletion or corruption might occur.

To support object recovery, you can enable versioning for your external cloud storage.

Active storage location

During the preview period, each external volume supports a single active storage location. The active location remains the same for the lifetime of the external volume.

If you specify multiple storage locations when you create an external volume, Snowflake assigns one location as the active location for the external volume.

Assignment occurs when the first table that uses the external volume is created. Snowflake uses the following logic to choose an active location:

  • If the STORAGE_LOCATIONS list contains one or more local storage locations, Snowflake uses the first local storage location in the list. A local storage location is one that uses the same cloud provider and is in the same region as your Snowflake account.

  • If the STORAGE_LOCATIONS list does not contain any local storage locations, Snowflake selects the first location in the list.
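The selection logic in the list above can be modeled in a few lines of Python. This is a sketch of the documented behavior only, not Snowflake's internal implementation; the StorageLocation class, the choose_active_location function, and the cloud and region labels are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StorageLocation:
    name: str
    cloud: str   # hypothetical label, for example "aws"
    region: str  # hypothetical label, for example "us-west-2"


def choose_active_location(
    locations: Sequence[StorageLocation],
    account_cloud: str,
    account_region: str,
) -> StorageLocation:
    """Pick the active location using the documented rules."""
    if not locations:
        raise ValueError("STORAGE_LOCATIONS must contain at least one location")
    # Rule 1: the first local location (same cloud and region) wins.
    for location in locations:
        if location.cloud == account_cloud and location.region == account_region:
            return location
    # Rule 2: with no local location, fall back to the first in the list.
    return locations[0]


locations = [
    StorageLocation("my-azure-northeurope", "azure", "northeurope"),
    StorageLocation("my-s3-us-west-2", "aws", "us-west-2"),
]
# The account is in AWS us-west-2, so the second location is local.
print(choose_active_location(locations, "aws", "us-west-2").name)
```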

Note

  • Cross-cloud/cross-region Iceberg tables are supported only when you use a catalog integration. For more information, see Cross-cloud/cross-region support.

  • External volumes created before Snowflake version 7.44 might have used different logic to select an active location.

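Set an external volume at the account, database, or schema level

To define which existing external volume to use for Iceberg tables, you can set the EXTERNAL_VOLUME parameter at the following levels:

Account:

Account administrators can use the ALTER ACCOUNT command to set the parameter for the account. If the value is set for the account, all Iceberg tables created in the account read from and write to this external volume by default.

Object:

Users can execute the appropriate CREATE <object> or ALTER <object> command to override the EXTERNAL_VOLUME parameter value at the database or schema level. The declaration with the lowest scope is used: schema > database > account.

In addition to the minimum privileges required to modify an object using the appropriate ALTER <object_type> command, a role must have the USAGE privilege on the external volume.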
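The precedence rule (schema > database > account) can be illustrated with a short Python sketch. The function name effective_external_volume and the volume names other than my_s3_vol are hypothetical, and the sketch models only the documented lookup order, not how Snowflake stores parameters.

```python
from typing import Optional


def effective_external_volume(
    account: Optional[str] = None,
    database: Optional[str] = None,
    schema: Optional[str] = None,
) -> Optional[str]:
    """Return the EXTERNAL_VOLUME value that applies to a new Iceberg table."""
    # Check the lowest scope first: schema, then database, then account.
    for value in (schema, database, account):
        if value is not None:
            return value
    # No level sets the parameter.
    return None


# The database-level setting overrides the account-level default.
print(effective_external_volume(account="account_vol", database="my_s3_vol"))
```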

The following statement sets an external volume (my_s3_vol) for a database named my_database_1:

ALTER DATABASE my_database_1
  SET EXTERNAL_VOLUME = 'my_s3_vol';

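After you set an external volume at the database level, you can create an Iceberg table in that database without specifying an external volume. The following statement creates an Iceberg table in my_database_1 that uses Snowflake as the catalog and uses the default external volume (my_s3_vol) set for the database.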

CREATE ICEBERG TABLE iceberg_reviews_table (
  id STRING,
  product_name STRING,
  product_id STRING,
  reviewer_name STRING,
  review_date DATE,
  review STRING
)
CATALOG = 'SNOWFLAKE'
BASE_LOCATION = 'my/product_reviews/';