Openflow Connector for MongoDB: Iceberg table destinations

Note

This connector is subject to the Snowflake Connector Terms.

The Openflow Connector for MongoDB supports writing to Snowflake-managed Apache Iceberg™ tables as an opt-in destination format. Iceberg v2 and v3 are both supported. Setting Table Storage Format = ICEBERG in the connector’s parameter context is the only connector-side change required. All Iceberg storage settings (external volume, catalog, version, and serialization policy) are inherited from the Snowflake destination database defaults.

Existing connectors using standard tables aren’t affected.

Prerequisites

  • Openflow runtime: An existing runtime to host the connector (minimum size Medium).
  • MongoDB source configured for replication: A replica set or sharded cluster with an oplog of sufficient size, and a user with readAnyDatabase privileges. Standalone instances aren’t supported. For details, see Set up the Openflow Connector for MongoDB.
  • Snowflake external volume: An external volume configured for Iceberg storage. See CREATE EXTERNAL VOLUME.
  • Snowflake destination database: An existing database configured with Iceberg parameters (next section).
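
Before configuring the connector, it can help to confirm that the external volume exists and that Snowflake can reach its storage location. The following is a minimal sketch, assuming a volume named <volume> and a role with sufficient privileges to inspect it. It is a suggested check, not a step the connector itself requires.

```sql
-- List external volumes matching the name (<volume> is a placeholder).
SHOW EXTERNAL VOLUMES LIKE '<volume>';

-- Inspect the storage locations configured on the volume.
DESCRIBE EXTERNAL VOLUME <volume>;

-- Ask Snowflake to test access to the volume's storage location.
SELECT SYSTEM$VERIFY_EXTERNAL_VOLUME('<volume>');
```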

Step 1: Configure the Snowflake destination database

Set the Iceberg defaults on the destination database. The connector reads these defaults at runtime, so no per-connector Iceberg configuration is needed beyond Table Storage Format.

If you need to create a new database for Iceberg destinations, you can set the Iceberg properties at creation time:

CREATE DATABASE <db>
  EXTERNAL_VOLUME = '<volume>'
  ICEBERG_VERSION_DEFAULT = <2|3>
  STORAGE_SERIALIZATION_POLICY = <COMPATIBLE|OPTIMIZED>;

To configure an existing database, use ALTER DATABASE:

ALTER DATABASE <db> SET
  EXTERNAL_VOLUME = '<volume>'
  ICEBERG_VERSION_DEFAULT = <2|3>
  STORAGE_SERIALIZATION_POLICY = <COMPATIBLE|OPTIMIZED>;

Parameter | Required | Notes
EXTERNAL_VOLUME | Yes | The external volume for Iceberg file storage.
ICEBERG_VERSION_DEFAULT | Yes | 2 or 3. The connector fails fast if this isn’t set.
STORAGE_SERIALIZATION_POLICY | Yes | COMPATIBLE produces Parquet files readable by external engines. OPTIMIZED enables Snowflake-specific query optimizations. Choose based on your data query needs. For more information, see STORAGE_SERIALIZATION_POLICY.
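
Because the connector fails fast when a required default is missing, it may be worth checking all three parameters before starting it. This sketch reuses the SHOW PARAMETERS form from the verification step and assumes the same <db> placeholder. In the output, the level column indicates whether a value was set on the database or inherited from a higher level.

```sql
-- Each statement returns one row; check the value and level columns.
SHOW PARAMETERS LIKE 'EXTERNAL_VOLUME' IN DATABASE <db>;
SHOW PARAMETERS LIKE 'ICEBERG_VERSION_DEFAULT' IN DATABASE <db>;
SHOW PARAMETERS LIKE 'STORAGE_SERIALIZATION_POLICY' IN DATABASE <db>;
```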

Note

CATALOG = 'SNOWFLAKE' is set automatically by the connector on each CREATE ICEBERG TABLE statement. Don’t set it at the database level.

The base location for each table is auto-derived using the flat layout: STORAGE_BASE_URL/database/schema/table_name.randomId/[data | metadata]/. No user configuration is needed.
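
To illustrate how the database defaults and the automatic catalog setting fit together, here is a hand-written sketch of a Snowflake-managed Iceberg table that relies on them. It is an assumption for illustration only and is not the DDL the connector issues: the column list is taken from the Table structure section for Iceberg v3, and the connector may add columns or options not shown here.

```sql
-- Illustrative only; the connector creates its own tables.
-- EXTERNAL_VOLUME and the serialization policy are not specified here,
-- so they fall back to the database defaults. BASE_LOCATION is omitted,
-- so the base location is auto-derived.
CREATE ICEBERG TABLE <db>.<schema>.<table> (
  id   STRING,
  data VARIANT   -- assumes ICEBERG_VERSION_DEFAULT = 3; use STRING for v2
)
CATALOG = 'SNOWFLAKE';
```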

Step 2: Set Table Storage Format in the connector’s parameter context

Set the Table Storage Format parameter to ICEBERG in the connector’s destination parameter context. The default is STANDARD.

For the full connector creation and configuration workflow, see Set up the Openflow Connector for MongoDB.

Step 3: Start and verify

Start the connector as usual. After the initial snapshot completes, verify the destination tables are Iceberg:

-- Confirm the table is Iceberg
SELECT GET_DDL('TABLE', '<db>.<schema>.<table>');

-- Confirm the Iceberg version on the database
SHOW PARAMETERS LIKE 'ICEBERG_VERSION_DEFAULT' IN DATABASE <db>;
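
As an additional check across a whole schema, rather than one table at a time, you could list its Iceberg tables. This is a sketch assuming the same <db> and <schema> placeholders. Destination tables created in the standard format would not appear in this output.

```sql
-- List Iceberg tables in the destination schema.
SHOW ICEBERG TABLES IN SCHEMA <db>.<schema>;
```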

Known limitations

  • No per-field type mapping: The connector stores each MongoDB document whole in a single data column. Individual BSON fields aren’t mapped to separate Iceberg columns. All BSON types are serialized into that column, which is variant on Iceberg v3 and string on Iceberg v2.

Table structure

The connector maps each MongoDB collection to a Snowflake Iceberg table with the following columns:

Column | Iceberg v3 type | Iceberg v2 type | Description
id | string | string | The MongoDB document _id field.
data | variant | string | The full document payload serialized as JSON.
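
Because the data column differs by Iceberg version, queries that read individual document fields differ too. The sketch below assumes a hypothetical top-level document field named status. The column names id and data are taken from the table above and may need quoting if the connector creates them as case-sensitive identifiers.

```sql
-- Iceberg v3: data is VARIANT, so fields can be traversed directly.
-- "status" is a hypothetical document field.
SELECT id, data:status::STRING AS status
FROM <db>.<schema>.<table>
LIMIT 10;

-- Iceberg v2: data is a JSON string, so parse it before traversing.
SELECT id, PARSE_JSON(data):status::STRING AS status
FROM <db>.<schema>.<table>
LIMIT 10;
```

On v2 the JSON is parsed at query time for every row, which is one practical reason to prefer v3 when fields are queried often.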

Switching table storage format or Iceberg version

Switching between Standard and Iceberg, or between Iceberg v2 and v3, requires creating a new connector with a new destination database configured for the intended format and version. To reuse the same destination database, you need a full connector reset: drop all schemas in the destination database and then adjust the connector configuration.

  • Existing destination tables aren’t migrated. The new connector performs a fresh snapshot.
  • You can’t change Table Storage Format on a running connector. The format is fixed in every destination and journal table the connector creates, so it can only be changed after the connector has been stopped and the destination tables have been dropped.
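
If you choose to reuse the same destination database, the reset described above might look like the following. This is a sketch under stated assumptions: the connector has already been stopped, <schema> stands for each schema the connector created, and the ALTER DATABASE statement is only needed when the Iceberg version changes.

```sql
-- Run only after the connector has been stopped.
-- Find the schemas in the destination database.
SHOW SCHEMAS IN DATABASE <db>;

-- Drop each connector-created schema and the tables in it.
-- Repeat for every schema; this deletes the replicated data.
DROP SCHEMA IF EXISTS <db>.<schema> CASCADE;

-- If switching Iceberg version, update the database default.
ALTER DATABASE <db> SET ICEBERG_VERSION_DEFAULT = <2|3>;
```

After the reset, set Table Storage Format in the connector’s parameter context and start the connector so that it performs a fresh snapshot.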

References