Convert an Iceberg table¶
This topic provides information about converting an Iceberg table that Snowflake doesn’t manage into a table that uses Snowflake as the Iceberg catalog.
You might choose to convert a table when you want full Snowflake platform support, including support for the Snowflake Iceberg Catalog SDK.
To learn about the differences between Iceberg table types, see Iceberg catalog options.
Before and after table conversion¶
When you convert an Iceberg table to use Snowflake as the catalog, the table becomes writable and Snowflake assumes life-cycle management for it.
The following table compares Iceberg tables before and after conversion:
An external catalog (such as AWS Glue), or no catalog at all. Requires a catalog integration.
Snowflake. Snowflake registers changes to the source data and registers the changes in the Snowflake catalog. Snowflake then updates the table metadata on your external volume.
Does not require a catalog integration.
Snowflake read operations
Snowflake write operations
Storage location for table data and metadata
External volume (external cloud storage).
External volume (external cloud storage) under a base location that you specify.
Data and metadata cleanup
Managed by you or your external catalog.
Snowflake does not touch any externally generated data and metadata files.
Accessible from the Snowflake Iceberg Catalog SDK
When you convert an Iceberg table, Snowflake doesn’t lock down or assume sole access to your external storage. To prevent table corruption, ensure that you monitor and/or stop any non-Snowflake writes (such as automated maintenance jobs) to your external storage location.
Before you convert an Iceberg table, ensure that Snowflake can write to your external volume.
For Snowflake to write to your external volume, the following conditions must be met:
Use the ALTER ICEBERG TABLE … REFRESH command to manually refresh the table before you convert it.
ALLOW_WRITESproperty for your external volume is set to
TRUE. To update the value of this property for an existing external volume, use the ALTER EXTERNAL VOLUME command. For example:
ALTER EXTERNAL VOLUME my_ext_vol SET ALLOW_WRITES=TRUE.
The access control permissions that you set on the cloud storage account must allow write access. For example, if you use an external volume configured for Amazon S3, your IAM role must have the
s3:PutObjectpermission for your S3 location.
Example: Convert an Iceberg table¶
This example starts by creating an Iceberg table from Iceberg files in object storage.
Snowflake uses the
METADATA_FILE_PATH value to look for the table metadata in the following location for column definitions:
CREATE ICEBERG TABLE myIcebergTable
Next, use the ALTER ICEBERG TABLE … REFRESH command to synchronize the table metadata with the latest metadata file. The following example command refreshes the table by specifying a metadata file path.
ALTER ICEBERG TABLE myIcebergTable REFRESH 'metadata/v2.metadata.json';
Finally, convert the table to use Snowflake as the Iceberg catalog by using a ALTER ICEBERG TABLE … CONVERT TO MANAGED command.
ALTER ICEBERG TABLE myIcebergTable CONVERT TO MANAGED
BASE_LOCATION = myBaseLocation;
In this example, the ALTER statement must specify a
BASE_LOCATION because the table was created
from Iceberg files in object storage and
BASE_LOCATION was not part of the original CREATE ICEBERG TABLE statement.
BASE_LOCATION defines the relative path from your external
volume to a directory where Snowflake writes table data and metadata for the converted table.
BASE_LOCATION was specified in the original CREATE ICEBERG TABLE statement, you don’t need to include it in your
ALTER ICEBERG TABLE … CONVERT TO MANAGED command.
For example, Snowflake writes table data to
Snowflake writes metadata for the converted table to
Conversion and data types¶
If you convert a table with an Iceberg data type that Snowflake doesn’t support, Snowflake uses a Snowflake data type to process and return values. However, Snowflake writes the original Iceberg data type to the table data files so that you can use the table with other compute engines like Apache Spark.
For data types such as
long, the data type that Snowflake uses for processing supports a larger range of values than the source data type.
To stay consistent with the source data type, Snowflake does not allow inserting values outside the range that the
source data type supports.
For example, consider a table with a column of Iceberg type
int. After conversion, Snowflake processes the column values
using the Snowflake data type NUMBER(10,0). NUMBER(10,0) has a range of (-9,999,999,999, +9,999,999,999), but
int has a
more limited range of (-2,147,483,648, +2,147,483,647). If you try to insert a value of 3,000,000,000 into that column, Snowflake
returns an out-of-range error message.
For more information about data types, including limitations, see Iceberg table data types.