Snowpipe Streaming high-performance architecture with Apache Iceberg™ tables

Snowpipe Streaming with high-performance architecture supports ingesting data into Snowflake-managed Apache Iceberg tables, including both Iceberg v2 and Iceberg v3 tables. This enables near real-time streaming of data into Iceberg tables with all the performance benefits of the high-performance architecture.

Note

The classic architecture supports Iceberg v2 tables only. If you need Iceberg v3 support, you must use the high-performance architecture. For more information about Iceberg support in the classic architecture, see Snowpipe Streaming Classic with Apache Iceberg™ tables.

How it works

Snowpipe Streaming ingests data through the PIPE object into your target Iceberg table. Snowflake creates Iceberg-compatible Apache Parquet data files with corresponding Iceberg metadata, and uploads them to your configured external cloud storage location. The data is made available as a Snowflake-managed Iceberg table registered with Snowflake as the Iceberg catalog.

Snowflake connects to your storage location using an external volume.

Get started

This section provides a step-by-step example of how to set up Snowpipe Streaming with high-performance architecture to ingest data into an Iceberg table.

Step 1: Create an external volume

Create an external volume that specifies a storage location for your Iceberg table data.
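For example, a minimal external volume for Amazon S3 might look like the following. The bucket name, role ARN, and location name are placeholders; substitute values for your own cloud storage account (the syntax differs slightly for Azure and Google Cloud Storage):

CREATE OR REPLACE EXTERNAL VOLUME my_external_volume
    STORAGE_LOCATIONS = (
        (
            NAME = 'my-s3-location'
            STORAGE_PROVIDER = 'S3'
            STORAGE_BASE_URL = 's3://my-bucket/iceberg/'
            STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-snowflake-role'
        )
    );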

Step 2: Create a Snowflake-managed Iceberg table

Create a Snowflake-managed Iceberg table with your configured external volume:

CREATE OR REPLACE ICEBERG TABLE my_iceberg_table (
    event_id NUMBER,
    event_type STRING,
    event_data VARIANT,
    event_timestamp TIMESTAMP_NTZ
)
    CATALOG = 'SNOWFLAKE'
    EXTERNAL_VOLUME = 'my_external_volume'
    BASE_LOCATION = 'my_iceberg_table/';

Step 3: Create a pipe for ingestion

Create a pipe that targets the Iceberg table. You can use the default pipe (automatically created) or create a custom pipe:

-- Option 1: Use the default pipe.
-- The default pipe is automatically created when you open a channel
-- against the table using the SDK. The default pipe name follows the
-- convention: <TABLE_NAME>-STREAMING (for example, MY_ICEBERG_TABLE-STREAMING).

-- Option 2: Create a custom pipe with explicit column mapping.
CREATE OR REPLACE PIPE my_iceberg_pipe AS
    COPY INTO my_iceberg_table (event_id, event_type, event_data, event_timestamp)
    FROM (
        SELECT $1:event_id::NUMBER,
               $1:event_type::STRING,
               $1:event_data,
               $1:event_timestamp::TIMESTAMP_NTZ
        FROM TABLE(DATA_SOURCE(TYPE => 'STREAMING'))
    );

Step 4: Stream data using the SDK

Configure the SDK to stream data into your Iceberg table through the pipe. Use the same SDK setup as described in Tutorial: Get started with Snowpipe Streaming high-performance architecture SDK, specifying your Iceberg table’s pipe in the client configuration.
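After the client flushes rows and the pipe commits them, you can confirm ingestion with an ordinary query against the Iceberg table. For example, using the table and pipe names from the steps above:

-- Inspect the most recently ingested rows.
SELECT event_id, event_type, event_timestamp
FROM my_iceberg_table
ORDER BY event_timestamp DESC
LIMIT 10;

-- List the pipe to confirm that it exists and is active.
SHOW PIPES LIKE 'my_iceberg_pipe';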

Supported Iceberg versions

The high-performance architecture supports both Iceberg v2 and Iceberg v3 tables.

The classic architecture supports only Iceberg v2 tables.

Supported data types

The Snowflake Ingest SDK supports most of the Iceberg data types that Snowflake currently supports. For more information, see Data types for Apache Iceberg™ tables.

The SDK also supports ingestion into the three structured data types: Structured ARRAY, Structured OBJECT, and Structured MAP.
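As an illustration, the following sketch defines an Iceberg table that uses all three structured types. The table and column names are hypothetical; the DDL options are the same as in the earlier example:

CREATE OR REPLACE ICEBERG TABLE my_structured_table (
    id NUMBER,
    tags ARRAY(STRING),                        -- structured ARRAY
    location OBJECT(lat DOUBLE, lon DOUBLE),   -- structured OBJECT
    attributes MAP(STRING, STRING)             -- structured MAP
)
    CATALOG = 'SNOWFLAKE'
    EXTERNAL_VOLUME = 'my_external_volume'
    BASE_LOCATION = 'my_structured_table/';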

Usage notes

Limitations

The following limitations apply to Snowpipe Streaming with high-performance architecture and Iceberg tables:

  • Partitioned Iceberg tables aren’t supported.

  • Schema evolution isn’t supported for Iceberg tables.

In addition, the general limitations of the Snowpipe Streaming high-performance architecture and the limitations of Iceberg tables also apply.