Iceberg Tables¶
Iceberg tables combine features standard in Snowflake tables, such as fast SQL processing, security and authorization, and data governance with open Apache Iceberg metadata and storage.
This topic introduces key concepts and provides instructions for creating Iceberg tables.
Iceberg Tables Billing¶
Data storage usage is billed directly by your cloud storage provider. Your Snowflake account is billed for virtual warehouse (compute) usage and cloud services.
Creating External Volumes¶
To read from and write to cloud storage, Snowflake must be able to access the security and access management policies on the storage location. Delegate this access to a Snowflake identity and access management (IAM) entity. The cloud storage location and IAM entity information you designate for your Iceberg tables are stored in a Snowflake external volume object.
Complete the instructions for your cloud storage service:
Setting External Volumes¶
The EXTERNAL_VOLUME
parameter determines which existing external volume to use. This parameter can be set at the following levels:
- Account
Account administrators can use the ALTER ACCOUNT command to set the parameter for the account. If the value is set for the account, all Iceberg tables created in the account read from and write to this external volume by default.
- Object
Users can execute the appropriate CREATE <object> or ALTER <object> command to override the
EXTERNAL_VOLUME
parameter value at the database, schema, or Iceberg table level. The lowest-scoped declaration is used: table > schema > database > account.In addition to the minimum privileges required to modify an object using the appropriate ALTER <object_type_> command, a role must have the USAGE privilege on the external volume.
Examples¶
Set the parameter for your account using ALTER ACCOUNT:
ALTER ACCOUNT SET EXTERNAL_VOLUME = 'exvol1';
Set the parameter for existing database db1
using ALTER DATABASE:
ALTER DATABASE db1 SET EXTERNAL_VOLUME = 'exvol2';
Set the parameter for existing schema db1.schema1
using ALTER SCHEMA:
ALTER SCHEMA db1.schema1 SET EXTERNAL_VOLUME = 'exvol2';
For an example of setting the parameter when creating a new Iceberg table, see Creating Iceberg Tables (in this topic). Note that the external volume cannot be changed for an existing Iceberg table.
Creating Iceberg Tables¶
Create an Iceberg table using CREATE ICEBERG TABLE. The Iceberg table stores the identity of the external volume where the data and Iceberg files are located.
Note that prior to creating an Iceberg table, you must create the external volume where the Iceberg table stores data and metadata. For instructions, see Creating External Volumes (in this topic).
List all Iceberg tables in a database, schema, or across your entire account using SHOW ICEBERG TABLES. Retrieve the details for a specific Iceberg table using DESCRIBE ICEBERG TABLE.
Drop an Iceberg table using DROP ICEBERG TABLE.
Examples¶
Create a simple Iceberg table that includes two columns. This Iceberg table stores data and metadata in the example external volume created in Creating External Volumes (in this topic).
CREATE OR REPLACE ICEBERG TABLE it1 (
col1 varchar(5),
col2 varchar(20)
)
EXTERNAL_VOLUME='exvol';
Loading and Unloading Data¶
Loading data into and unloading data out of tables is currently limited to Parquet source and destination files, respectively.
Generating Snapshots of DML Changes¶
Iceberg uses a snapshot based querying model, where data files are mapped using manifest and metadata files. A snapshot represents the state of a table at a point in time and is used to access the complete set of data files in the table.
DML operations on Iceberg tables write any row inserts, updates, and deletions to the Parquet files in the external volume. This activity does not write to the Iceberg metadata files in the external volume automatically. To generate Iceberg metadata for the DML changes to the table, call the SYSTEM$GET_ICEBERG_TABLE_INFORMATION function.
The first time you call the function for a specific Iceberg table, Snowflake generates the complete set of metadata for the Iceberg table. Subsequent calls generate incremental updates to the metadata files.
The function returns the location of the metadata file and status of the snapshot generation.
For example:
SELECT SYSTEM$GET_ICEBERG_TABLE_INFORMATION('it1');
+-----------------------------------------------------------------------------------------------+
| SYSTEM$GET_ICEBERG_TABLE_INFORMATION('IT1') |
|-----------------------------------------------------------------------------------------------|
| {"metadataLocation":"s3://mybucket/metadata/v1.metadata.json","status":"success"} |
+-----------------------------------------------------------------------------------------------+
Modifying Iceberg Tables¶
There is currently no support for the ALTER ICEBERG TABLE command. To modify Iceberg tables, execute ALTER TABLE statements. All common ALTER TABLE syntax is supported.
To change the external volume used by an Iceberg table, execute the following command:
ALTER TABLE ... SET EXTERNAL_VOLUME = '<external_volume_name>'
For example, change the external volume used by Iceberg table it1
to external volume extvol1
:
ALTER TABLE it1 SET EXTERNAL_VOLUME = 'extvol1';
Iceberg Table DDL¶
To support creating and managing Iceberg tables and external volumes, Snowflake provides the following set of special DDL commands.
Streams on Iceberg Tables¶
The SQL syntax and functionality for streams on Iceberg tables is identical to streams on regular tables.
Search Optimization Service Support for Iceberg Tables¶
The search optimization service supports Iceberg tables.
Limitations¶
The following limitations currently apply to Iceberg tables and their supporting objects and files, and are subject to change:
- Access by third-party clients to Iceberg data, metadata
Third-party clients cannot read Iceberg tables without first exporting the metadata.
Third-party clients cannot append to, delete from, or upsert data to Iceberg tables.
- External volumes
You must access the cloud storage locations in external volumes using direct credentials. Storage integrations are not supported.
The trust relationship must be configured separately for each external volume created.
All Iceberg table files that rely on a specific external volume write metadata and data files to the same location in cloud storage. That is, you cannot specify a sub-folder location or partition when creating a new Iceberg table.
- Metadata files
Creating Iceberg metadata files and updating them to reflect DML changes to a table requires calling the SYSTEM$GET_ICEBERG_TABLE_INFORMATION function.
The metadata files do not identify the most recent snapshot of an Iceberg table.
- Iceberg tables
Replication does not include Iceberg tables.
Iceberg tables cannot be clustered.
Materialized views cannot query Iceberg tables.
The GET_DDL function does not return CREATE <object> statements for Iceberg tables.