CREATE ICEBERG TABLE

Creates a new Iceberg table in the current/specified schema or replaces an existing Iceberg table. Iceberg tables combine features standard in Snowflake tables, such as fast SQL processing, security and authorization, and data governance with open Apache Iceberg metadata and storage.

In addition, this command supports the following variants for Iceberg tables that use Snowflake as the catalog:

  • CREATE ICEBERG TABLE … AS SELECT (creates a populated table; also referred to as CTAS)

  • CREATE ICEBERG TABLE … LIKE (creates an empty copy of an existing table)

This topic refers to Iceberg tables as simply “tables” except where specifying Iceberg tables avoids confusion.

Note

Before creating a table, you must create the external volume where the Iceberg metadata and data files are stored. For instructions, see Configure an external volume for Iceberg tables.

If you use an external Iceberg catalog, or no catalog at all, you must also create a catalog integration for the table. To learn more, see Configure a catalog integration for Iceberg tables.

See also:

ALTER ICEBERG TABLE, DROP ICEBERG TABLE , SHOW ICEBERG TABLES , DESCRIBE ICEBERG TABLE

Syntax

Snowflake as the Iceberg catalog

-- Snowflake as the Iceberg catalog
CREATE [ OR REPLACE ] ICEBERG TABLE [ IF NOT EXISTS ] <table_name> (
    -- Column definition
    <col_name> <col_type>
      [ inlineConstraint ]
      [ NOT NULL ]
      [ COLLATE '<collation_specification>' ]
      [ { DEFAULT <expr>
          | { AUTOINCREMENT | IDENTITY }
            [ { ( <start_num> , <step_num> )
                | START <num> INCREMENT <num>
              } ]
        } ]
      [ [ WITH ] MASKING POLICY <policy_name> [ USING ( <col_name> , <cond_col1> , ... ) ] ]
      [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
      [ COMMENT '<string_literal>' ]

    -- Additional column definitions
    [ , <col_name> <col_type> [ ... ] ]

    -- Out-of-line constraints
    [ , outoflineConstraint [ ... ] ]
  )
  [ CLUSTER BY ( <expr> [ , <expr> , ... ] ) ]
  [ EXTERNAL_VOLUME = '<external_volume_name>' ]
  [ CATALOG = 'SNOWFLAKE' ]
  BASE_LOCATION = '<relative_path_from_external_volume>'
  [ DATA_RETENTION_TIME_IN_DAYS = <integer> ]
  [ MAX_DATA_EXTENSION_TIME_IN_DAYS = <integer> ]
  [ CHANGE_TRACKING = { TRUE | FALSE } ]
  [ DEFAULT_DDL_COLLATION = '<collation_specification>' ]
  [ COPY GRANTS ]
  [ COMMENT = '<string_literal>' ]
  [ [ WITH ] ROW ACCESS POLICY <policy_name> ON ( <col_name> [ , <col_name> ... ] ) ]
  [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
Copy

Where:

inlineConstraint ::=
  [ CONSTRAINT <constraint_name> ]
  { UNIQUE
    | PRIMARY KEY
    | [ FOREIGN KEY ] REFERENCES <ref_table_name> [ ( <ref_col_name> ) ]
  }
  [ <constraint_properties> ]
Copy

For additional inline constraint details, see CREATE | ALTER TABLE … CONSTRAINT.

outoflineConstraint ::=
  [ CONSTRAINT <constraint_name> ]
  { UNIQUE [ ( <col_name> [ , <col_name> , ... ] ) ]
    | PRIMARY KEY [ ( <col_name> [ , <col_name> , ... ] ) ]
    | [ FOREIGN KEY ] [ ( <col_name> [ , <col_name> , ... ] ) ]
      REFERENCES <ref_table_name> [ ( <ref_col_name> [ , <ref_col_name> , ... ] ) ]
  }
  [ <constraint_properties> ]
Copy

For additional out-of-line constraint details, see CREATE | ALTER TABLE … CONSTRAINT.

External Iceberg catalog

-- External Iceberg catalog
CREATE [ OR REPLACE ] ICEBERG TABLE [ IF NOT EXISTS ] <table_name>
  [ EXTERNAL_VOLUME = '<external_volume_name>' ]
  [ CATALOG = '<catalog_integration_name>' ]
  externalCatalogParams
  [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
  [ COMMENT = '<string_literal>' ]
Copy

Where:

externalCatalogParams (for AWS Glue Data Catalog) ::=
  CATALOG_TABLE_NAME = '<catalog_table_name>'
  [ CATALOG_NAMESPACE = '<catalog_namespace>' ]
Copy
externalCatalogParams (for Iceberg files in object storage) ::=
  METADATA_FILE_PATH = '<metadata_file_path>'
Copy

Variant syntax

The following variant syntax is supported for creating Iceberg tables that use Snowflake as the catalog.

CREATE ICEBERG TABLE … AS SELECT (also referred to as CTAS)

Creates a new table populated with the data returned by a query:

CREATE [ OR REPLACE ] ICEBERG TABLE <table_name> [ ( <col_name> [ <col_type> ] , <col_name> [ <col_type> ] , ... ) ]
  [ CLUSTER BY ( <expr> [ , <expr> , ... ] ) ]
  [ COPY GRANTS ]
  AS SELECT <query>
  [ ... ]
Copy

A masking policy can be applied to a column in a CTAS statement. Specify the masking policy after the column data type. Similarly, a row access policy can be applied to the table. For example:

CREATE ICEBERG TABLE <table_name> ( <col1> <data_type> [ WITH ] MASKING POLICY <policy_name> [ , ... ] )
  ...
  [ WITH ] ROW ACCESS POLICY <policy_name> ON ( <col1> [ , ... ] )
  AS SELECT <query>
  [ ... ]
Copy

Note

In a CTAS, the COPY GRANTS clause is valid only when combined with the OR REPLACE clause. COPY GRANTS copies privileges from the table being replaced with CREATE OR REPLACE (if it already exists), not from the source table(s) being queried in the SELECT statement. CTAS with COPY GRANTS lets you overwrite a table with a new set of data while keeping existing grants on that table.

For more information about COPY GRANTS, see COPY GRANTS in this document.

CREATE ICEBERG TABLE … LIKE

Creates a new table with the same column definitions as an existing table, but without copying data from the existing table. Column names, types, defaults, and constraints are copied to the new table:

CREATE [ OR REPLACE ] ICEBERG TABLE <table_name> LIKE <source_table>
  [ CLUSTER BY ( <expr> [ , <expr> , ... ] ) ]
  [ COPY GRANTS ]
  [ ... ]
Copy

For more information about COPY GRANTS, see COPY GRANTS in this document.

Note

CREATE TABLE … LIKE for a table with an auto-increment sequence accessed through a data share is currently not supported.

Required parameters

table_name

Specifies the identifier (name) for the table; must be unique for the schema in which the table is created.

In addition, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (for example, "My object"). Identifiers enclosed in double quotes are also case-sensitive.

For more information, see Identifier requirements.

col_name

Specifies the column identifier (name). All the requirements for table identifiers also apply to column identifiers.

For more information, see Identifier requirements and Reserved & Limited Keywords.

Note

In addition to the standard reserved keywords, the following keywords cannot be used as column identifiers because they are reserved for ANSI-standard context functions:

  • CURRENT_DATE

  • CURRENT_ROLE

  • CURRENT_TIME

  • CURRENT_TIMESTAMP

  • CURRENT_USER

For the list of reserved keywords, see Reserved & Limited Keywords.

col_type

Specifies the data type for the column.

For information about the data types that can be specified for table columns, see Iceberg table data types.

query

Subquery that calls the INFER_SCHEMA function and formats the output as an array.

Optional parameters

EXTERNAL_VOLUME = 'external_volume_name'

Specifies the identifier (name) for the external volume where the Iceberg table stores its metadata files and data in Parquet format. Iceberg metadata and manifest files store the table schema, partitions, snapshots, and other metadata.

If you don’t specify this parameter, the Iceberg table defaults to the external volume for the schema, database, or account. The schema takes precedence over the database, and the database takes precedence over the account.

CONSTRAINT ...

Defines an inline or out-of-line constraint for the specified column(s) in the table.

For syntax information, see CREATE | ALTER TABLE … CONSTRAINT. For more information about constraints, see Constraints.

COLLATE 'collation_specification'

Specifies the collation to use for column operations such as string comparison. This option applies only to text columns (VARCHAR, STRING, TEXT, etc.). For more information, see Collation Specifications.

DEFAULT ... or . AUTOINCREMENT ...

Specifies whether a default value is automatically inserted in the column if a value is not explicitly specified via an INSERT or CREATE TABLE AS SELECT statement:

DEFAULT expr

Column default value is defined by the specified expression which can be any of the following:

  • Constant value.

  • Sequence reference (seq_name.NEXTVAL).

  • Simple expression that returns a scalar value.

    The simple expression can include a SQL UDF (user-defined function) if the UDF is not a secure UDF.

    Note

    If a default expression refers to a SQL UDF, then the function is replaced by its definition at table creation time. If the user-defined function is redefined in the future, this doesn’t update the column’s default expression.

    The simple expression cannot contain references to:

    • Subqueries.

    • Aggregates.

    • Window functions.

    • Secure UDFs.

    • UDFs written in languages other than SQL (for example, Java, JavaScript).

    • External functions.

{ AUTOINCREMENT | IDENTITY } [ { ( start_num , step_num ) | START num INCREMENT num } ]

AUTOINCREMENT and IDENTITY are synonymous. When either is used, the default value for the column starts with a specified number and each successive value automatically increments by the specified amount.

Caution

Snowflake uses a sequence to generate the values for an auto-incremented column. Sequences have limitations; see Sequence Semantics.

The default value for both start and step/increment is 1.

AUTOINCREMENT and IDENTITY can be used only for columns with numeric data types.

Default: No value (the column has no default value)

Note

DEFAULT and AUTOINCREMENT are mutually exclusive; only one can be specified for a column.

MASKING POLICY = policy_name

Specifies the masking policy to set on a column.

COMMENT 'string_literal'

Specifies a comment for the column.

(Note that comments can be specified at the column level or the table level. The syntax for each is slightly different.)

USING ( col_name , cond_col_1 ... )

Specifies the arguments to pass into the conditional masking policy SQL expression.

The first column in the list specifies the column for the policy conditions to mask or tokenize the data and must match the column to which the masking policy is set.

The additional columns specify the columns to evaluate to determine whether to mask or tokenize the data in each row of the query result when a query is made on the first column.

If the USING clause is omitted, Snowflake treats the conditional masking policy as a normal masking policy.

CLUSTER BY ( expr [ , expr , ... ] )

Specifies one or more columns or column expressions in the table as the clustering key. For more information, see Clustering Keys & Clustered Tables.

Default: No value (no clustering key is defined for the table)

Important

Clustering is only supported for tables that use Snowflake as the Iceberg catalog.

Clustering keys are not intended or recommended for all tables; they typically benefit very large (that is, multi-terabyte) tables.

Before you specify a clustering key for a table, you should understand micro-partitions. For more information, see Understanding Snowflake Table Structures.

DATA_RETENTION_TIME_IN_DAYS = integer

Specifies the retention period for the table so that Time Travel actions (SELECT, CLONE, UNDROP) can be performed on historical data in the table. For more information, see Understanding & Using Time Travel and Working with Temporary and Transient Tables.

For a detailed description of this object-level parameter, as well as more information about object parameters, see Parameters.

Values:

  • Standard Edition: 0 or 1

  • Enterprise Edition:

    • 0 to 90 for permanent tables

    • 0 or 1 for temporary and transient tables

Default:

  • Standard Edition: 1

  • Enterprise Edition (or higher): 1 (unless a different default value was specified at the schema, database, or account level)

Note

A value of 0 effectively disables Time Travel for the table.

MAX_DATA_EXTENSION_TIME_IN_DAYS = integer

Object parameter that specifies the maximum number of days for which Snowflake can extend the data retention period for the table to prevent streams on the table from becoming stale.

For a detailed description of this parameter, see MAX_DATA_EXTENSION_TIME_IN_DAYS.

CHANGE_TRACKING = { TRUE | FALSE }

Specifies whether to enable change tracking on the table.

  • TRUE enables change tracking on the table. This setting adds a pair of hidden columns to the source table and begins storing change tracking metadata in the columns. These columns consume a small amount of storage.

    The change tracking metadata can be queried using the CHANGES clause for SELECT statements, or by creating and querying one or more streams on the table.

  • FALSE does not enable change tracking on the table.

Default: FALSE

DEFAULT_DDL_COLLATION = 'collation_specification'

Specifies a default collation specification for the columns in the table, including columns added to the table in the future.

For more information about the parameter, see DEFAULT_DDL_COLLATION.

COPY GRANTS

Specifies to retain the access privileges from the original table when a new table is created using any of the following CREATE TABLE variants:

  • CREATE OR REPLACE TABLE

  • CREATE TABLE … LIKE

  • CREATE TABLE … CLONE

The parameter copies all privileges, except OWNERSHIP, from the existing table to the new table. The new table does not inherit any future grants defined for the object type in the schema. By default, the role that executes the CREATE TABLE statement owns the new table.

If the parameter is not included in the CREATE ICEBERG TABLE statement, then the new table does not inherit any explicit access privileges granted on the original table, but does inherit any future grants defined for the object type in the schema.

Note:

  • With data sharing:

    • If the existing table was shared to another account, the replacement table is also shared.

    • If the existing table was shared with your account as a data consumer, and access was further granted to other roles in the account (using GRANT IMPORTED PRIVILEGES on the parent database), access is also granted to the replacement table.

  • The SHOW GRANTS output for the replacement table lists the grantee for the copied privileges as the role that executed the CREATE ICEBERG TABLE statement, with the current timestamp when the statement was executed.

  • The operation to copy grants occurs atomically in the CREATE ICEBERG TABLE command (that is, within the same transaction).

COMMENT = 'string_literal'

Specifies a comment for the table.

Default: No value

(Note that comments can be specified at the column level or the table level. The syntax for each is slightly different.)

ROW ACCESS POLICY policy_name ON ( col_name [ , col_name ... ] )

Specifies the row access policy to set on a table.

TAG ( tag_name = 'tag_value' [ , tag_name = 'tag_value' , ... ] )

Specifies the tag name and the tag string value.

The tag value is always a string, and the maximum number of characters for the tag value is 256.

For information about specifying tags in a statement, see Tag quotas for objects and columns.

Snowflake catalog parameters

CATALOG = 'SNOWFLAKE'

Specifies Snowflake as the Iceberg catalog. Snowflake handles all life-cycle maintenance, such as compaction, for the table.

BASE_LOCATION = 'relative_path_from_external_volume'

Specifies a relative path from the table’s EXTERNAL_VOLUME location to a directory where Snowflake can write table data and metadata.

This parameter cannot be changed after you create a table.

External catalog parameters (externalCatalogParams)

CATALOG = 'catalog_integration_name'

Specifies the identifier (name) of the catalog integration for this table.

If not specified, the Iceberg table defaults to the catalog integration for the schema, database, or account. The schema takes precedence over the database, and the database takes precedence over the account.

AWS Glue

CATALOG_TABLE_NAME = 'catalog_table_name'

Specifies the table name as recognized by your AWS Glue Data Catalog. For an example of using CATALOG_TABLE_NAME when you create an Iceberg table, see Examples (in this topic). This parameter cannot be changed after you create the table.

CATALOG_NAMESPACE = 'catalog_namespace'

Optionally specifies the namespace (for example, my_glue_database) for the AWS Glue Data Catalog source, and overrides the default catalog namespace specified with the catalog integration. By specifying a namespace at the table level, you can use a single catalog integration for AWS Glue to create Iceberg tables across different databases.

If not specified, the table uses the default catalog namespace associated with the catalog integration.

Iceberg files in object storage

METADATA_FILE_PATH = 'metadata_file_path'

Specifies the relative path of the Iceberg metadata file to use for column definitions. For example, if s3://mybucket_us_east_1/metadata/v1.metadata.json is the full path to your metadata file, and the external volume storage location is s3://mybucket_us_east_1/, specify metadata/v1.metadata.json as the value for METADATA_FILE_PATH.

Before Snowflake version 7.34, this parameter was called METADATA_FILE_NAME.

Note

With Snowflake versions 7.34 and later, you do not specify a BASE_LOCATION to create a table from Iceberg files in object storage.

Before version 7.34, a parameter named BASE_LOCATION (also called FILE_PATH in previous versions) was required to create a table from Iceberg files in object storage. The parameter specified a relative path from the EXTERNAL_VOLUME location.

You can continue to execute a script or statement that uses the old syntax. If you do, the following notes apply:

  • The Parquet data files and Iceberg metadata files for the table must be within the BASE_LOCATION.

  • To refresh the table, you must specify a path relative to the BASE_LOCATION. For example, if the full path to your metadata file is s3://mybucket_us_east_1/my_base_location/metadata/v1.metadata.json, specify metadata/v1.metadata.json as the metadata-file-relative-path.

    For more information, see ALTER ICEBERG TABLE … REFRESH.

Access control requirements

A role used to execute this SQL command must have the following privileges at a minimum:

Privilege

Object

Notes

CREATE ICEBERG TABLE

Schema

CREATE EXTERNAL VOLUME

Account

Required to create a new external volume.

USAGE

External Volume

Required to reference an existing external volume.

CREATE INTEGRATION

Account

Required to create a new catalog integration.

USAGE

Catalog integration

Required to reference an existing catalog integration.

Note that operating on any object in a schema also requires the USAGE privilege on the parent database and schema.

For instructions on creating a custom role with a specified set of privileges, see Creating custom roles.

For general information about roles and privilege grants for performing SQL actions on securable objects, see Overview of Access Control.

Usage notes

  • A schema cannot contain tables and/or views with the same name. When creating a table:

    • If a view with the same name already exists in the schema, an error is returned and the table is not created.

    • If a table with the same name already exists in the schema, an error is returned and the table is not created, unless the optional OR REPLACE keyword is included in the command.

  • CREATE OR REPLACE <object> statements are atomic. That is, when an object is replaced, the old object is deleted and the new object is created in a single transaction.

    This means that any queries concurrent with the CREATE OR REPLACE ICEBERG TABLE operation use either the old or new table version.

  • Similar to reserved keywords, ANSI-reserved function names (CURRENT_DATE, CURRENT_TIMESTAMP, etc.) cannot be used as column names.

  • Recreating a table (using the optional OR REPLACE keyword) drops its history, which makes any stream on the table stale. A stale stream is unreadable.

  • Cross-cloud and cross-region Iceberg tables are not currently supported when you use Snowflake as the Iceberg catalog. If CREATE ICEBERG TABLE returns an error message like "External volume <volume_name> must have a STORAGE_LOCATION defined in the local region ...", make sure that your external volume uses an active storage location in the same region as your Snowflake account.

  • If you created your external volume or catalog integration using a double-quoted identifier, you must specify the identifier exactly as created (including the double quotes) in your CREATE ICEBERG TABLE statement. Failure to include the quotes might result in an Object does not exist error (or similar type of error).

    To view an example, see the Examples (in this topic) section.

  • With Snowflake versions 7.34 and later, you do not specify a BASE_LOCATION to create a table from Iceberg files in object storage.

    Before version 7.34, a parameter named BASE_LOCATION (also called FILE_PATH in previous versions) was required to create a table from Iceberg files in object storage. The parameter specified a relative path from the EXTERNAL_VOLUME location.

    You can continue to execute a script or statement that uses the old syntax. If you do, the following notes apply:

    • The Parquet data files and Iceberg metadata files for the table must be within the BASE_LOCATION.

    • To refresh the table, you must specify a path relative to the BASE_LOCATION. For example, if the full path to your metadata file is s3://mybucket_us_east_1/my_base_location/metadata/v1.metadata.json, specify metadata/v1.metadata.json as the metadata-file-relative-path.

      For more information, see ALTER ICEBERG TABLE … REFRESH.

  • CREATE TABLE … LIKE:

    • If the source table has clustering keys then the new table has clustering keys. By default, Automatic Clustering is not suspended for the new table even if Automatic Clustering was suspended for the source table.

  • CREATE TABLE … AS SELECT (CTAS):

    • When clustering keys are specified in a CTAS statement:

      • Column definitions are required and must be explicitly specified in the statement.

      • By default, Automatic Clustering is not suspended for the new table – even if Automatic Clustering is suspended for the source table.

      The ORDER BY sub-clause in a CREATE TABLE statement does not affect the order of the rows returned by future SELECT statements on that table. To specify the order of rows in future SELECT statements, use an ORDER BY sub-clause in those statements.

  • Regarding metadata:

    Attention

    Customers should ensure that no personal data (other than for a User object), sensitive data, export-controlled data, or other regulated data is entered as metadata when using the Snowflake service. For more information, see Metadata Fields in Snowflake.

Examples

Create an Iceberg table with Snowflake as the catalog

This example creates an Iceberg table with Snowflake as the Iceberg catalog. The resulting table is managed by Snowflake and supports read and write access. The statement specifies a value for the BASE_LOCATION parameter. This tells Snowflake where to write table data and metadata on the external volume.

CREATE ICEBERG TABLE my_iceberg_table (amount int)
  CATALOG='SNOWFLAKE'
  EXTERNAL_VOLUME='my_external_volume'
  BASE_LOCATION='my/relative/path/from/extvol';
Copy

Create an Iceberg table with AWS Glue as the catalog

This example creates an Iceberg table that uses the AWS Glue Data Catalog. To override the default catalog namespace and set a catalog namespace for the table, the statement uses the optional CATALOG_NAMESPACE parameter.

CREATE ICEBERG TABLE glue_iceberg_table
  EXTERNAL_VOLUME='glue_catalog_volume'
  CATALOG='glue_catalog_integration'
  CATALOG_TABLE_NAME='my_glue_catalog_table'
  CATALOG_NAMESPACE='icebergcatalogdb2';
Copy

Create an Iceberg table from Iceberg metadata in object storage

This example creates an Iceberg table from Iceberg metadata stored in external cloud storage. It also specifies a relative path to the table metadata on the external volume.

CREATE ICEBERG TABLE my_iceberg_table
  EXTERNAL_VOLUME='my_external_volume'
  CATALOG='my_catalog_integration'
  METADATA_FILE_PATH='path/to/metadata/v1.metadata.json';
Copy

Specify an external volume or catalog integration with a double-quoted identifier

This example creates an Iceberg table with an external volume and catalog integration whose identifiers contain double quotes. Identifiers enclosed in double quotes are case-sensitive and often contain special characters.

The identifiers "exvol_lower" and "catint_lower" are specified exactly as created (including the double quotes). Failure to include the quotes might result in an Object does not exist error (or similar type of error).

To learn more, see Double-quoted identifiers.

CREATE OR REPLACE ICEBERG TABLE itable_with_quoted_catalog
  EXTERNAL_VOLUME = '"exvol_lower"'
  CATALOG = '"catint_lower"'
  METADATA_FILE_PATH='path/to/metadata/v1.metadata.json';
Copy