Configure a catalog integration for Iceberg tables

This topic provides information to help you create and configure a catalog integration for Iceberg tables. To create an Iceberg table that uses an external Iceberg catalog, or no catalog at all, you must specify a catalog integration.

Note

A catalog integration is required only when you want to create a read-only Iceberg table that uses an external Iceberg catalog or Iceberg files in object storage. You don’t need a catalog integration to create an Iceberg table that uses Snowflake as the Iceberg catalog. To use Snowflake as your catalog, set the CATALOG parameter to SNOWFLAKE in the CREATE ICEBERG TABLE command.
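
As a minimal sketch of the Snowflake-as-catalog case, the following statement creates an Iceberg table without referencing any catalog integration. The table name, column, external volume name, and base location are hypothetical placeholders, and the statement assumes that an external volume already exists in your account.

```sql
-- Hypothetical names: my_managed_iceberg_table, my_external_volume, my/relative/path.
-- Assumes the external volume was created beforehand.
CREATE ICEBERG TABLE my_managed_iceberg_table (amount INT)
  CATALOG='SNOWFLAKE'
  EXTERNAL_VOLUME='my_external_volume'
  BASE_LOCATION='my/relative/path';
```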

Create a catalog integration

You can create a catalog integration for Iceberg tables by using the CREATE CATALOG INTEGRATION command.

Example: Create a catalog integration for Iceberg files in object storage

The following example statement creates a catalog integration for Iceberg metadata that is in an external cloud storage location by setting OBJECT_STORE as the CATALOG_SOURCE value.

CREATE OR REPLACE CATALOG INTEGRATION icebergCatalogInt
  CATALOG_SOURCE=OBJECT_STORE
  TABLE_FORMAT=ICEBERG
  ENABLED=TRUE;

After you create a catalog integration, you can create an Iceberg table.
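
For illustration, a table that reads Iceberg metadata from object storage might be created along the following lines. This is a sketch under assumptions: the table name, the external volume name, and the metadata file path are hypothetical, and the external volume is assumed to point at the storage location that holds the Iceberg files. Only the catalog integration name comes from the example above.

```sql
-- Hypothetical names: myIcebergTable, my_external_volume, and the metadata path.
-- icebergCatalogInt is the catalog integration created in the previous example.
CREATE ICEBERG TABLE myIcebergTable
  EXTERNAL_VOLUME='my_external_volume'
  CATALOG='icebergCatalogInt'
  METADATA_FILE_PATH='path/to/metadata/v1.metadata.json';
```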

Configure a catalog integration for AWS Glue

This section covers how to create a catalog integration for AWS Glue and grant Snowflake restricted access to the AWS Glue Data Catalog.

Note

To complete the instructions in this section, you must have permissions in Amazon Web Services (AWS) to create and manage IAM policies and roles. If you are not an AWS administrator, ask your AWS administrator to perform these tasks.

Step 1: Configure access permissions for the AWS Glue Data Catalog

As a best practice, create a new IAM policy for Snowflake to access the Glue Data Catalog. You can then attach the policy to an IAM role and use the security credentials that AWS generates for that role to access files in the catalog. For instructions, see Creating IAM policies and Modifying a role permissions policy in the AWS Identity and Access Management User Guide.

At a minimum, Snowflake requires the following permissions on the Glue Data Catalog to access information about tables.

  • glue:GetTable

  • glue:GetTables

The following example policy (in JSON format) provides the required permissions to access all of the tables in a specified database.

{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Sid": "AllowGlueCatalogTableAccess",
         "Effect": "Allow",
         "Action": [
            "glue:GetTable",
            "glue:GetTables"
         ],
         "Resource": [
            "arn:aws:glue:*:<accountid>:table/*/*",
            "arn:aws:glue:*:<accountid>:catalog",
            "arn:aws:glue:*:<accountid>:database/<database-name>"
         ]
      }
   ]
}

Note

  • You can modify the Resource element of this policy to further restrict the allowed resources (for example, catalog, databases, or tables). For more information, see Resource types defined by AWS Glue.

  • If you use encryption for AWS Glue, you must modify the policy to add AWS Key Management Service (AWS KMS) permissions. For more information, see Setting up encryption in AWS Glue.

Step 2: Create a catalog integration in Snowflake

Create a catalog integration for the Glue Data Catalog using the CREATE CATALOG INTEGRATION command.

The following example creates a catalog integration that uses an AWS Glue catalog source. The example specifies a value for the optional GLUE_REGION parameter.

CREATE CATALOG INTEGRATION glueCatalogInt
   CATALOG_SOURCE=GLUE
   CATALOG_NAMESPACE='my.catalogdb'
   TABLE_FORMAT=ICEBERG
   GLUE_AWS_ROLE_ARN='arn:aws:iam::123456789012:role/myGlueRole'
   GLUE_CATALOG_ID='123456789012'
   GLUE_REGION='us-east-2'
   ENABLED=TRUE;

Step 3: Retrieve the AWS IAM user and external ID for your Snowflake account

To retrieve information about the AWS IAM user and the external ID that were created for your Snowflake account when you created the catalog integration, execute the DESCRIBE CATALOG INTEGRATION command. You provide this information to AWS in the next section to establish a trust relationship.

The following example command describes the catalog integration created in the previous step:

DESCRIBE CATALOG INTEGRATION glueCatalogInt;

Record the following values:

  • GLUE_AWS_IAM_USER_ARN: The AWS IAM user created for your Snowflake account, for example, arn:aws:iam::123456789001:user/abc1-b-self1234. Snowflake provisions a single IAM user for your entire Snowflake account. All Glue catalog integrations in your account use that IAM user.

  • GLUE_AWS_EXTERNAL_ID: The external ID that is needed to establish a trust relationship.

You will provide these values in the next section.

Step 4: Grant the IAM user permissions to access the AWS Glue Data Catalog

Update the trust policy for the same IAM role that you specified by ARN when you created the catalog integration (GLUE_AWS_ROLE_ARN). Add the values that you recorded in Step 3: Retrieve the AWS IAM user and external ID for your Snowflake account to the trust policy.

For instructions, see Modifying a trust policy.

The following example trust policy demonstrates where to specify the GLUE_AWS_IAM_USER_ARN and GLUE_AWS_EXTERNAL_ID values:

{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Sid": "",
         "Effect": "Allow",
         "Principal": {
            "AWS": "<glue_iam_user_arn>"
         },
         "Action": "sts:AssumeRole",
         "Condition": {
            "StringEquals": {
               "sts:ExternalId": "<glue_aws_external_id>"
            }
         }
      }
   ]
}

Where:

  • glue_iam_user_arn is the GLUE_AWS_IAM_USER_ARN value that you recorded.

  • glue_aws_external_id is the GLUE_AWS_EXTERNAL_ID value that you recorded.

Note

  • For security reasons, if you create a new catalog integration (or recreate an existing catalog integration using the CREATE OR REPLACE CATALOG INTEGRATION syntax), the new catalog integration has a different external ID and cannot resolve the trust relationship unless you modify the trust policy with the new external ID.

  • To verify that your permissions are configured correctly, create an Iceberg table using this catalog integration. Snowflake doesn’t verify that your permissions are set correctly until you create an Iceberg table that references this catalog integration.
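
One way to run that verification is to create a table against the Glue catalog integration, roughly as sketched here. The table name, the external volume name, and the Glue table name are hypothetical, and the sketch assumes an external volume that can read the table's data files. If the IAM policy or trust policy is misconfigured, this statement is where the error would be expected to surface.

```sql
-- Hypothetical names: myGlueTable and glueCatalogVolume.
-- glueCatalogInt is the catalog integration created in Step 2.
CREATE ICEBERG TABLE myGlueTable
  EXTERNAL_VOLUME='glueCatalogVolume'
  CATALOG='glueCatalogInt'
  CATALOG_TABLE_NAME='myGlueTable';
```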

Next steps

After you configure a catalog integration for AWS Glue, you can create an Iceberg table with AWS Glue as the catalog.

To update the table and keep it in sync with changes in AWS Glue, use an ALTER ICEBERG TABLE … REFRESH statement. For more information, see Refresh the metadata for an Iceberg table that uses AWS Glue as the catalog.
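
A refresh of a Glue-backed table could look like the following sketch. The table name is a hypothetical placeholder for an Iceberg table that already uses AWS Glue as its catalog.

```sql
-- Hypothetical table name; assumes the table was created with a Glue catalog integration.
ALTER ICEBERG TABLE myGlueTable REFRESH;
```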