Snowpipe Streaming high-performance architecture with Apache Iceberg™ tables¶
Snowpipe Streaming with high-performance architecture supports ingesting data into Snowflake-managed Apache Iceberg tables, including both Iceberg v2 and Iceberg v3 tables. This enables near real-time streaming of data into Iceberg tables with all the performance benefits of the high-performance architecture.
Note
The classic architecture supports Iceberg v2 tables only. If you need Iceberg v3 support, you must use the high-performance architecture. For more information about Iceberg support in the classic architecture, see Snowpipe Streaming Classic with Apache Iceberg™ tables.
How it works¶
Snowpipe Streaming ingests data through the PIPE object into your target Iceberg table. Snowflake creates Iceberg-compatible Apache Parquet data files with corresponding Iceberg metadata, and uploads them to your configured external cloud storage location. The data is made available as a Snowflake-managed Iceberg table registered with Snowflake as the Iceberg catalog.
Snowflake connects to your storage location using an external volume.
Get started¶
This section walks through an example of setting up Snowpipe Streaming with the high-performance architecture to ingest data into an Iceberg table.
Step 1: Create an external volume¶
Create an external volume that specifies a storage location for your Iceberg table data.
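As a minimal sketch, assuming the data lives in an Amazon S3 bucket, the statement might look like the following. The volume name, location name, bucket URL, and IAM role ARN are placeholders rather than values from this guide, and other cloud providers use a different STORAGE_PROVIDER value and different credential parameters.

```sql
-- Hypothetical names throughout: replace the volume name, bucket URL,
-- and role ARN with values from your own cloud account.
CREATE OR REPLACE EXTERNAL VOLUME my_iceberg_ext_vol
  STORAGE_LOCATIONS = (
    (
      NAME = 'my-s3-location'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://my-example-bucket/iceberg/'
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-iceberg-role'
    )
  );

-- Inspect the volume definition after creating it.
DESCRIBE EXTERNAL VOLUME my_iceberg_ext_vol;
```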
Step 2: Create a Snowflake-managed Iceberg table¶
Create a Snowflake-managed Iceberg table with your configured external volume:
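A minimal sketch of such a statement, assuming an external volume named my_iceberg_ext_vol already exists. The table name, column list, and base location are illustrative placeholders, not names taken from this guide.

```sql
-- Hypothetical table: the columns and names are for illustration only.
CREATE OR REPLACE ICEBERG TABLE my_iceberg_table (
  id INT,
  name STRING,
  event_ts TIMESTAMP_NTZ(6)
)
  CATALOG = 'SNOWFLAKE'                     -- Snowflake acts as the Iceberg catalog
  EXTERNAL_VOLUME = 'my_iceberg_ext_vol'    -- assumed to exist already
  BASE_LOCATION = 'my_iceberg_table/';      -- directory under the volume's base URL
```

CATALOG = 'SNOWFLAKE' is what makes the table Snowflake-managed, which is the only catalog type that Snowpipe Streaming supports.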
Step 3: Create a pipe for ingestion¶
Create a pipe that targets the Iceberg table. You can use the default pipe, which Snowflake creates automatically, or create a custom pipe:
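A custom pipe could be sketched as follows, assuming a target table my_iceberg_table with the columns id, name, and event_ts. The pipe name and the field names read from each streamed row are hypothetical, and the column expressions may need explicit casts depending on your schema.

```sql
-- Hypothetical pipe: reads each streamed row as $1 and maps its fields
-- to the target table's columns in order (id, name, event_ts).
CREATE OR REPLACE PIPE my_iceberg_pipe
AS COPY INTO my_iceberg_table
FROM (
  SELECT $1:id, $1:name, $1:event_ts
  FROM TABLE(DATA_SOURCE(TYPE => 'STREAMING'))
);
```

A custom pipe is mainly useful when incoming rows need to be reshaped before they land in the table. If the streamed rows already match the table's columns, the default pipe is usually enough.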
Step 4: Stream data using the SDK¶
Configure the SDK to stream data into your Iceberg table through the pipe. Use the same SDK setup as described in Tutorial: Get started with Snowpipe Streaming high-performance architecture SDK, specifying your Iceberg table’s pipe in the client configuration.
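Once the client has sent some rows, a quick query can confirm that they reached the table. This is only a sanity check under assumed names (my_iceberg_table and its event_ts column are placeholders), and newly streamed rows may take a short time to become visible.

```sql
-- Hypothetical check: count rows and find the most recent event timestamp.
SELECT COUNT(*)      AS row_count,
       MAX(event_ts) AS latest_event
FROM my_iceberg_table;
```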
Supported Iceberg versions¶
The high-performance architecture supports both Iceberg v2 and Iceberg v3 tables.
The classic architecture supports only Iceberg v2 tables.
Supported data types¶
The Snowflake Ingest SDK supports most of the Iceberg data types that Snowflake currently supports. For more information, see Data types for Apache Iceberg™ tables.
The SDK also supports ingestion into the three structured data types: Structured ARRAY, Structured OBJECT, and Structured MAP.
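For illustration, a table that uses all three structured types might be declared as below. This is a sketch with hypothetical table, column, and external volume names. The element, field, and value types are arbitrary choices.

```sql
-- Hypothetical table showing the three structured types.
CREATE OR REPLACE ICEBERG TABLE my_structured_iceberg_table (
  id INT,
  tags ARRAY(STRING),                        -- structured ARRAY
  address OBJECT(city STRING, zip STRING),   -- structured OBJECT
  attributes MAP(STRING, INT)                -- structured MAP
)
  CATALOG = 'SNOWFLAKE'
  EXTERNAL_VOLUME = 'my_iceberg_ext_vol'     -- assumed to exist already
  BASE_LOCATION = 'my_structured_iceberg_table/';
```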
Usage notes¶
Snowpipe Streaming supports only Snowflake as the Iceberg catalog. Externally managed Iceberg tables that use external catalogs (such as AWS Glue or Hive Metastore) aren’t supported. However, you can sync your Snowflake-managed Iceberg tables with Snowflake Open Catalog.
Snowflake connects to your storage location using an external volume. Because the data and metadata files reside in your own cloud storage, you are responsible for managing that storage.
The Iceberg-compatible Parquet files are created based on the STORAGE_SERIALIZATION_POLICY specified on the Iceberg table.
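The policy is set when the table is created. The sketch below assumes hypothetical table and external volume names and picks COMPATIBLE, which favors interoperability with third-party engines. OPTIMIZED is the other accepted value.

```sql
-- Hypothetical table with an explicit serialization policy.
CREATE OR REPLACE ICEBERG TABLE my_compatible_iceberg_table (
  id INT,
  payload STRING
)
  CATALOG = 'SNOWFLAKE'
  EXTERNAL_VOLUME = 'my_iceberg_ext_vol'     -- assumed to exist already
  BASE_LOCATION = 'my_compatible_iceberg_table/'
  STORAGE_SERIALIZATION_POLICY = COMPATIBLE; -- or OPTIMIZED
```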
Limitations¶
The following limitations apply to Snowpipe Streaming with high-performance architecture and Iceberg tables:
Partitioned Iceberg tables aren’t supported.
Schema evolution isn’t supported for Iceberg tables.
The general limitations of the Snowpipe Streaming high-performance architecture and the general limitations of Iceberg tables also apply.