Configure a catalog integration for Apache Iceberg™ REST catalogs

An Apache Iceberg™ REST catalog integration lets Snowflake access Apache Iceberg™ tables managed in a remote catalog that complies with the open source Apache Iceberg REST OpenAPI specification.

Connecting to REST catalogs

You can connect to an Iceberg REST API that uses a public endpoint or a private network.

Public endpoint

To connect to an Iceberg REST API using a public endpoint, you can create a catalog integration that uses the following authentication methods:

  • OAuth

  • Bearer token or Personal Access Token (PAT)

  • SigV4

Private network

To connect to an Iceberg REST API that’s hosted in a private network, you can create a catalog integration that uses Signature Version 4 (SigV4) authentication.

Create a catalog integration

Create a catalog integration for your chosen authentication method using the CREATE CATALOG INTEGRATION (Apache Iceberg™ REST) command. The values you specify for the REST_CONFIG and REST_AUTHENTICATION arguments differ according to the authentication method you choose.

OAuth

The following example creates a REST catalog integration that uses OAuth to connect to Tabular.

CREATE OR REPLACE CATALOG INTEGRATION tabular_catalog_int
  CATALOG_SOURCE = ICEBERG_REST
  TABLE_FORMAT = ICEBERG
  CATALOG_NAMESPACE = 'default'
  REST_CONFIG = (
    CATALOG_URI = 'https://api.tabular.io/ws'
    WAREHOUSE = '<tabular_warehouse_name>'
  )
  REST_AUTHENTICATION = (
    TYPE = OAUTH
    OAUTH_TOKEN_URI = 'https://api.tabular.io/ws/v1/oauth/tokens'
    OAUTH_CLIENT_ID = '<oauth_client_id>'
    OAUTH_CLIENT_SECRET = '<oauth_secret>'
    OAUTH_ALLOWED_SCOPES = ('catalog')
  )
  ENABLED = TRUE;
Copy

The following example creates a REST catalog integration that uses OAuth to connect to Databricks Unity Catalog.

CREATE OR REPLACE CATALOG INTEGRATION unity_catalog_int_oauth
  CATALOG_SOURCE = ICEBERG_REST
  TABLE_FORMAT = ICEBERG
  CATALOG_NAMESPACE = 'default'
  REST_CONFIG = (
    CATALOG_URI = 'https://my-api/api/2.1/unity-catalog/iceberg'
    WAREHOUSE = '<catalog_name>'
  )
  REST_AUTHENTICATION = (
    TYPE = OAUTH
    OAUTH_TOKEN_URI = 'https://my-api/oidc/v1/token'
    OAUTH_CLIENT_ID = '123AbC ...'
    OAUTH_CLIENT_SECRET = '1365910ab ...'
    OAUTH_ALLOWED_SCOPES = ('all-apis', 'sql')
  )
  ENABLED = TRUE;
Copy

Bearer token or PAT

The following example creates a REST catalog integration that uses a PAT token to connect to Databricks Unity Catalog.

CREATE OR REPLACE CATALOG INTEGRATION unity_catalog_int_pat
  CATALOG_SOURCE = ICEBERG_REST
  TABLE_FORMAT = ICEBERG
  CATALOG_NAMESPACE = 'my_namespace'
  REST_CONFIG = (
    CATALOG_URI = 'https://my-api/api/2.1/unity-catalog/iceberg'
    WAREHOUSE = '<catalog_name>'
  )
  REST_AUTHENTICATION = (
    TYPE = BEARER
    BEARER_TOKEN = 'eyAbCD...eyDeF...'
  )
  ENABLED = TRUE;
Copy

SigV4

The following diagram shows how Snowflake interacts with your REST catalog server using API Gateway and SigV4 authentication.

Diagram showing how an Iceberg REST catalog works with Amazon API Gateway, IAM, and S3.

Follow the steps in this section to use a REST API in Amazon API Gateway and Signature Version 4 (SigV4) authentication to securely connect Snowflake to an Iceberg REST catalog that isn’t publicly accessible.

  1. Create a REST API in Amazon API Gateway

  2. Create an IAM policy and attach it to a role

  3. Attach an API Gateway resource policy (private APIs only)

  4. Select IAM-based authorization for your API

  5. Retrieve the endpoint URL

  6. Create a catalog integration for SigV4

  7. Configure the trust relationship in IAM

Create a REST API in Amazon API Gateway

To connect Snowflake to your Iceberg REST catalog, you need a REST API resource in Amazon API Gateway.

If you don’t already have a REST API resource in Amazon API Gateway for your Iceberg catalog, you can create a simple REST API by modifying and importing an Iceberg catalog OpenAPI definition file or manually adding endpoints.

Note

To import the Iceberg catalog OpenAPI definition, you must modify the YAML file. Amazon API Gateway does not support all components of the OpenAPI 2.0 or 3.0 specifications. For more information, see Amazon API Gateway important notes for REST APIs.

  1. In the AWS Management Console, search for and select API Gateway.

  2. Select Create API.

  3. Select Build under REST API. To create a private REST API, select Build under REST API Private.

  4. Select one of the following options:

    • To create an API by manually adding endpoints, select New API.

    • To create an API using an OpenAPI definition file, select Import API, then upload the file or paste the definition in the code editor.

  5. Enter an API name and optional Description.

    Note

    You don’t need to enter a VPC endpoint ID when you create a private REST API.

  6. Select Create API.

For more information about creating and developing a REST API in API Gateway, see the Amazon API Gateway Developer Guide.

Create an IAM policy and attach it to a role

In this step, you create an AWS IAM role that Snowflake can use to connect to API Gateway. You attach a policy to the role that grants permission to call your API.

  1. In the AWS Management Console, search for and select IAM.

  2. From the left-hand navigation pane, select Policies.

  3. Select Create policy and then select JSON for the Policy editor.

  4. Replace the empty policy with a policy that has permission to invoke your API methods. For example, the following general policy allows the invoke action for all API Gateway resources in an AWS account.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "execute-api:Invoke"
            ],
            "Resource": "arn:aws:execute-api:*:<aws_account_id>:*"
        }
      ]
    }
    
    Copy

    Important

    As a best practice, use a policy that grants the minimum required privileges for your use case. For additional guidance and example policies, see Control access to an API with IAM permissions.

  5. Select Next.

  6. Enter a Policy name (for example, snowflake_access) and an optional Description.

  7. Select Create policy.

  8. From the left-hand navigation pane in the IAM dashboard, select Roles.

  9. Select a role to attach the policy to. When you create a catalog integration, you specify this role. If you don’t have a role, create a new role.

  10. On the role Summary page in the Permissions tab, select Add permissions » Attach policies.

  11. Search for and check the box next to the policy that you created for API Gateway, then select Add permissions.

  12. On the role Summary page, copy the role ARN. You specify this ARN when you create a catalog integration.

Attach an API Gateway resource policy (private APIs only)

If your REST API is private, you must attach an Amazon API Gateway resource policy to your API. The resource policy allows Snowflake to call your API from the Amazon Virtual Private Cloud (VPC) in which your Snowflake account is located.

  1. In Snowflake, call the SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO function to retrieve the ID for the VPC in which your Snowflake account is located. Copy the VPC ID from the function output.

    SELECT SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO();
    
    Copy

    Output:

    {"snowflake-vpc-id":["vpc-c1c234a5"]}
    
  2. Follow the instructions in Attaching API Gateway resource policies to attach a resource policy to your REST API.

    Paste and modify the following example policy.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Deny",
          "Principal": "*",
          "Action": "execute-api:Invoke",
          "Resource": "<api_gateway_arn>",
          "Condition": {
            "StringNotEquals": {
              "aws:sourceVpc": "<snowflake_vpc_id>"
            }
          }
        },
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:sts::123456789XXX:assumed-role/<my_api_permissions_role_name>/snowflake"
          },
          "Action": "execute-api:Invoke",
          "Resource": "<api_gateway_arn>/*/*/*",
          "Condition": {
            "StringEquals": {
              "aws:sourceVpc": "<snowflake_vpc_id>"
            }
          }
        }
      ]
    }
    
    Copy

The first statement in the policy denies all requests that don’t originate from the Snowflake VPC. The second statement allows the invoke action (for all methods) from requests originating from the Snowflake VPC that use the assumed-role session principal.

To learn more about API Gateway resource policies, see:

Select IAM-based authorization for your API

Select IAM-based authorization for each method that you want to provide access to in your REST API. With IAM-based authorization, Snowflake can use the IAM role that you configured to make calls to the API.

  1. In the Amazon API Gateway console, select your REST API.

  2. For each method:

    1. Under Resources, select a method from the list.

    2. Under Method request settings, select Edit.

    3. For Authorization, select AWS IAM.

    4. Select Save.

  3. To apply the authorization changes, select Deploy API. For more information, see Deploying a REST API from the API Gateway console.

Retrieve the endpoint URL

Retrieve your REST API endpoint URL (or invoke URL). Your API must be deployed to a stage before you can retrieve the endpoint URL.

  1. In the Amazon API Gateway console, select your REST API.

  2. In the left-hand navigation pane, select Stages.

  3. Under Stage details, copy the Invoke URL.

You specify the endpoint URL when you create a catalog integration.

Create a catalog integration for SigV4

After you have a REST API in Amazon API Gateway and have completed the initial steps to control access to your API using IAM permissions, you can create a catalog integration in Snowflake.

To view the command syntax and parameter descriptions, see CREATE CATALOG INTEGRATION (Apache Iceberg™ REST).

Public REST API

To create a catalog integration for a public REST API, specify ICEBERG_REST as the CATALOG_SOURCE and use SIGV4 authentication.

Include details such as your API endpoint URL and IAM role ARN.

CREATE OR REPLACE CATALOG INTEGRATION my_rest_catalog_integration
  CATALOG_SOURCE = ICEBERG_REST
  TABLE_FORMAT = ICEBERG
  CATALOG_NAMESPACE = 'my_namespace'
  REST_CONFIG = (
    CATALOG_URI = 'https://asdlkfjwoalk-execute-api.us-west-2-amazonaws.com/MyApiStage'
    CATALOG_API_TYPE = AWS_API_GATEWAY
  )
  REST_AUTHENTICATION = (
    TYPE = SIGV4
    SIGV4_IAM_ROLE = 'arn:aws:iam::123456789XXX:role/my_api_permissions_role'
  )
  ENABLED = TRUE;
Copy

Private REST API

To create a catalog integration for a private REST API, you must set the CATALOG_API_TYPE parameter to AWS_PRIVATE_API_GATEWAY.

CREATE OR REPLACE CATALOG INTEGRATION my_rest_catalog_integration
  CATALOG_SOURCE = ICEBERG_REST
  TABLE_FORMAT = ICEBERG
  CATALOG_NAMESPACE = 'my_namespace'
  REST_CONFIG = (
    CATALOG_URI = 'https://asdlkfjwoalk-execute-api.us-west-2-amazonaws.com/MyApiStage'
    CATALOG_API_TYPE = AWS_PRIVATE_API_GATEWAY
  )
  REST_AUTHENTICATION = (
    TYPE = SIGV4
    SIGV4_IAM_ROLE = 'arn:aws:iam::123456789XXX:role/my_api_permissions_role'
  )
  ENABLED = TRUE;
Copy

Configure the trust relationship in IAM

Next, you must retrieve information about the AWS IAM user and the external ID that were created for your Snowflake account when you created the catalog integration, and configure the trust relationship for your IAM role.

  1. In Snowflake, call the DESCRIBE CATALOG INTEGRATION command:

    DESCRIBE CATALOG INTEGRATION my_rest_catalog_integration;
    
    Copy

    Record the following values:

    Value

    Description

    API_AWS_IAM_USER_ARN

    The AWS IAM user created for your Snowflake account, for example, arn:aws:iam::123456789001:user/abc1-b-self1234. Snowflake provisions a single IAM user for your entire Snowflake account.

    API_AWS_EXTERNAL_ID

    The external ID that’s needed to establish a trust relationship.

  2. In the AWS Management Console, search for and select IAM.

  3. From the left-hand navigation pane, select Roles.

  4. Select the IAM role that you created for your catalog integration.

  5. Select the Trust relationships tab.

  6. Select Edit trust policy.

  7. Modify the policy document with the values that you recorded.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "<api_aws_iam_user_arn>"
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<api_aws_external_id>"
            }
          }
        }
      ]
    }
    
    Copy
  8. Select Update policy to save your changes.

Checking your REST catalog configuration

You can use the following scenarios to check whether you’ve correctly configured authorization and access control with your Iceberg REST catalog so that Snowflake can interact with your catalog server.

Check a configuration for OAuth

Follow these steps to check your configuration for OAuth with your remote REST catalog.

Step 1: Retrieve an access token

Use a curl command to retrieve an access token from your catalog. The following example requests an access token from Snowflake Open Catalog:

curl -X POST https://xx123xx.us-west-2.aws.snowflakecomputing.com/polaris/api/catalog/v1/oauth/tokens \
    -H 'Accepts: application/json' \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    --data-urlencode 'grant_type=client_credentials' \
    --data-urlencode 'scope=PRINCIPAL_ROLE:ALL' \
    --data-urlencode 'client_id=<my_client_id>' \
    --data-urlencode 'client_secret=<my_client_secret>' | jq
Copy

Where:

  • https://xx123xx.us-west-2.aws.snowflakecomputing.com/polaris/api/catalog/v1/oauth/tokens is the endpoint for retrieving an OAuth token (getToken).

  • scope is the same as the value that you specify for OAUTH_ALLOWED_SCOPES parameter when you create a catalog integration. For multiple scopes, use a space as a separator.

  • my_client_id is the same client ID that you specify for the OAUTH_CLIENT_ID parameter when you create a catalog integration.

  • my_client_secret is the same client secret that you specify for the OAUTH_CLIENT_SECRET parameter when you create a catalog integration.

Example return value:

{
  "access_token": "xxxxxxxxxxxxxxxx",
  "token_type": "bearer",
  "issued_token_type": "urn:ietf:params:oauth:token-type:access_token",
  "expires_in": 3600
}

Step 2: Verify the access token permissions

Using the access token that you retrieved in the previous step, verify that you have permission to access your catalog server.

You can use a curl command to list the configuration settings for your catalog:

curl -X POST "https://xx123xx.us-west-2.aws.snowflakecomputing.com/polaris/api/catalog/v1/config?warehouse=<warehouse>" \
    -H 'Accepts: application/json' \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -H 'Authorization: Bearer ${ACCESS_TOKEN}' | jq
Copy

Where:

  • ?warehouse=warehouse optionally specifies the warehouse name to request from your catalog (if supported). For Snowflake Open Catalog, the warehouse name is your catalog name.

  • ACCESS_TOKEN is a variable that contains the access_token that you retrieved in the previous step.

Example return value:

{
  "defaults": {
    "default-base-location": "s3://my-bucket/polaris/"
  },
  "overrides": {
    "prefix": "my-catalog"
  }
}

Step 3: Load a table from the catalog

You can also make a GET request to load a table. Snowflake uses the loadTable operation to load table data from your REST catalog.

curl -X POST "https://xx123xx.us-west-2.aws.snowflakecomputing.com/polaris/api/catalog/v1/<prefix>/namespaces/<namespace>/tables/<table>" \
    -H 'Accepts: application/json' \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -H 'Authorization: Bearer ${ACCESS_TOKEN}' | jq
Copy

Where:

  • prefix optionally specifies the prefix obtained from the previous getConfig response.

  • namespace is the namespace of the table you want to retrieve. If the namespace is nested, use the %1F separator; for example, parentNamespace%1FchildNamespace.

  • table is the table name.

Check a configuration for a bearer token

Follow these steps to check your configuration with your remote REST catalog for using a bearer token.

Step 1: Verify the access token permissions

Use a curl command to verify that you have permission to access your catalog server:

curl -X POST "https://xx123xx.us-west-2.aws.snowflakecomputing.com/polaris/api/catalog/v1/config?warehouse=<warehouse>" \
    -H 'Accepts: application/json' \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -H 'Authorization: Bearer ${BEARER_TOKEN}' | jq
Copy

Where:

  • https://xx123xx.us-west-2.aws.snowflakecomputing.com/polaris/api/catalog/v1/oauth/tokens is the endpoint for retrieving an OAuth token (getToken).

  • ?warehouse=warehouse optionally specifies the warehouse name to request from your catalog (if supported).

  • BEARER_TOKEN is a variable that contains the access_token that you retrieved in the previous step.

Example return value:

{
  "defaults": {
    "default-base-location": "s3://my-bucket/polaris"
  },
  "overrides": {
    "prefix": "my-catalog"
  }
}

Step 2: Load a table from the catalog

You can also make a GET request to load a table. Snowflake uses the loadTable operation to load table data from your REST catalog.

curl -X POST "https://xx123xx.us-west-2.aws.snowflakecomputing.com/polaris/api/catalog/v1/<prefix>/namespaces/<namespace>/tables/<table>" \
    -H 'Accepts: application/json' \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -H 'Authorization: Bearer ${BEARER_TOKEN}' | jq
Copy

Where:

  • prefix optionally specifies the prefix obtained from the previous getConfig response.

  • namespace is the namespace of the table you want to retrieve. If the namespace is nested, use the %1F separator; for example, parentNamespace%1FchildNamespace.

  • table is the table name.

Check a configuration for SigV4

Follow these steps to check your configuration for SigV4 with AWS.

Step 1: Add your user to the IAM role trust relationship

When you create a REST catalog integration for SigV4, Snowflake provisions an AWS IAM user for your Snowflake account. You add that Snowflake IAM user to the trust relationship for an IAM role with permission to access your API Gateway resources.

To test your configuration, you can assume the role as a user in your AWS account after you add your AWS user to the role’s trust policy document. To retrieve your current IAM user ARN, use the sts get-caller-identity command for the AWS Command Line Interface (CLI) :

aws sts get-caller-identity
Copy

Example output:

{
  "UserId": "ABCDEFG1XXXXXXXXXXX",
  "Account": "123456789XXX",
  "Arn": "arn:aws:iam::123456789XXX:user/managed/my_user"
}

The updated trust policy document should include both the Snowflake user ARN and your user ARN as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "<snowflake_iam_user_arn>",
          "<my_iam_user_arn>"
        ]
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "my_external_id"
        }
      }
    }
  ]
}
Copy

For full instructions, see Update a role trust policy in the AWS IAM documentation.

Step 2: Assume your IAM role to get temporary credentials

To get temporary security credentials for AWS, use the sts assume-role command for the AWS CLI.

aws sts assume-role \
  --role-arn <my_role_arn> \
  --role-session-name <session_name>
Copy

Where:

  • my_role_arn is the Amazon Resource Name (ARN) of the IAM role that you’ve configured for Snowflake.

  • session_name is a string identifier of your choice for the assumed role session; for example, my_rest_session.

Example output:

{
  "Credentials": {
      "AccessKeyId": "XXXXXXXXXXXXXXXXXXXXX",
      "SecretAccessKey": "XXXXXXXXXXXXXXXXXXXXX",
      "SessionToken": "XXXXXXXXXXXXXXXXXXXXX",
      "Expiration": "2024-10-09T08:13:15+00:00"
  },
  "AssumedRoleUser": {
      "AssumedRoleId": "{AccessKeyId}:my_rest_catalog_session",
      "Arn": "arn:aws:sts::123456789XXX:assumed-role/my_catalog_role/my_rest_catalog_session"
  }
}

Note

If the assume-role command fails, it means that your current AWS user isn’t included in the role’s trust policy as an allowed principal.

Similarly, if the Snowflake IAM user ARN isn’t included in your trust policy, Snowflake won’t be able to connect to your API Gateway resources. For more information, see Configure the trust relationship in IAM.

Step 3: Verify that your IAM role has the right permissions

Using the temporary credentials that you retrieved in the previous step, verify that your IAM role has permission to invoke your API Gateway APIs.

You can use a curl command to list the configuration settings for your catalog:

curl -v -X GET  "https://123xxxxxxx.execute-api.us-west-2.amazonaws.com/test_v2/v1/config?warehouse=<warehouse>" \
  --user "$AWS_ACCESS_KEY_ID":"$AWS_SECRET_ACCESS_KEY" \
  --aws-sigv4 "aws:amz:us-west-2:execute-api" \
  -H "x-amz-security-token: $AWS_SESSION_TOKEN"
Copy

Where:

  • 123xxxxxxx.execute-api.us-west-2.amazonaws.com is your API Gateway hostname.

  • test_v2 is the name of the stage that your API is deployed to.

  • v1/config specifies the getConfig operation from the Iceberg catalog OpenAPI definition.

  • ?warehouse=warehouse optionally specifies the warehouse name to request from your catalog (if supported).

  • $AWS_ACCESS_KEY_ID is a variable that contains the AccessKeyId that you retrieved using the sts assume-role command.

  • $AWS_SECRET_ACCESS_KEY is a variable that contains the SecretAccessKey that you retrieved using the sts assume-role command.

  • $AWS_SESSION_TOKEN is a variable that contains the SessionToken that you retrieved using the sts assume-role command.

Example return value:

{
  "defaults": {},
  "overrides": {
    "prefix": "my-catalog"
  }
}

You can also make a GET request to load a table. Snowflake uses the loadTable operation to load table data from your REST catalog.

curl -v -X GET "https://123xxxxxxx.execute-api.us-west-2.amazonaws.com/test_v2/v1/<prefix>/namespaces/<namespace>/tables/<table>" \
    --user "$AWS_ACCESS_KEY_ID":"$AWS_SECRET_ACCESS_KEY" \
    --aws-sigv4 "aws:amz:us-west-2:execute-api" \
    -H "x-amz-security-token: $AWS_SESSION_TOKEN"
Copy

Where:

  • prefix optionally specifies the prefix obtained from the previous getConfig response.

  • namespace is the namespace of the table you want to retrieve. If the namespace is nested, use the %1F separator; for example, parentNamespace%1FchildNamespace.

  • table is the table name.

Private API

For a private API, you can specify your VPC endpoint and private Amazon API Gateway hostname in the same curl commands.

For example:

curl -v -X GET  "https://vpce-xxxxxxxxxxxxxxxxxxxxxxxxxx.execute-api.us-west-2.vpce.amazonaws.com/test_v2/v1/config?warehouse=<warehouse>" \
  --user "$AWS_ACCESS_KEY_ID":"$AWS_SECRET_ACCESS_KEY" \
  --aws-sigv4 "aws:amz:us-west-2:execute-api" \
  -H "x-amz-security-token: $AWS_SESSION_TOKEN"
  -H "Host: abc1defgh2.execute-api.us-west-2.amazonaws.com"
Copy

Where:

  • https://vpce-xxxxxxxxxxxxxxxxxxxxxxxxxx.execute-api.us-west-2.vpce.amazonaws.com/... is the hostname of your VPC endpoint.

  • abc1defgh2.execute-api.us-west-2.amazonaws.com is the hostname of your private API in Amazon API Gateway.