Use Apache Iceberg™ tables with Snowflake Open Catalog in Snowflake
Use Apache Iceberg™ tables in Snowflake to work with Snowflake Open Catalog.
What is Snowflake Open Catalog?
Open Catalog is a catalog implementation for Iceberg built on the open source Apache Iceberg REST protocol. To learn more, see the Snowflake Open Catalog documentation.
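Snowflake supports two ways of working with Open Catalog: querying Iceberg tables that Open Catalog manages, and syncing Snowflake-managed Iceberg tables with Open Catalog.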
Considerations
When using Snowflake with Open Catalog, be aware of the following considerations:
Storage
Like Snowflake-managed Iceberg tables, Iceberg tables managed by Open Catalog are stored in external cloud storage.
Iceberg tables in Snowflake use an external volume to provide access to your cloud storage, while tables managed by Open Catalog use a storage configuration.
Configuration for syncing Snowflake-managed Iceberg tables
To sync a Snowflake-managed table with Open Catalog, you must first create an external volume in Snowflake and then create an external catalog in Open Catalog that points to the same location as the external volume. For more information, see Sync a Snowflake-managed table with Snowflake Open Catalog.
Table access
Snowflake-managed Iceberg tables that you sync with Open Catalog are read-only in Open Catalog.
Snowflake can query but can’t write to tables managed by Open Catalog.
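As a rough illustration of where a synced table ends up: the terminology section of this topic notes that a synced Snowflake-managed table lands in the catalog named in the catalog integration, under two parent namespaces named after its Snowflake database and schema. The Python sketch below models only that naming rule. The function name `synced_table_path` and every sample name are hypothetical, and the code doesn't call any Snowflake or Open Catalog API.

```python
def synced_table_path(catalog: str, database: str, schema: str, table: str) -> list[str]:
    """Return an illustrative Open Catalog path for a synced Snowflake table.

    Models the rule described in this topic: catalog, then a namespace named
    after the database, then a namespace named after the schema, then the table.
    """
    parts = [catalog, database, schema, table]
    if any(not part for part in parts):
        raise ValueError("catalog, database, schema, and table must all be non-empty")
    return parts


# All names below are made up for illustration.
path = synced_table_path("my_external_catalog", "SALES_DB", "PUBLIC", "ORDERS")
print(" > ".join(path))
```

The two middle elements of the returned list are the parent namespaces. The sketch makes no claim about how Open Catalog handles identifier case or quoting.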
Terminology differences
This section summarizes the key differences in terminology between Snowflake and Open Catalog.
Snowflake term | Open Catalog term | Description |
---|---|---|
database | catalog | Open Catalog uses catalogs, which are like databases in Snowflake. In Open Catalog, you create one or more catalog resources to organize Iceberg tables under namespaces. For more information, see Catalog in the Open Catalog documentation. When you sync a Snowflake-managed table with Open Catalog, it syncs to the catalog you specify in the catalog integration. Also, it syncs with two parent namespaces, which are named after its database and schema in Snowflake. For example, if you have a |
schema | namespace | In Open Catalog, the concepts of schema and namespace are synonymous and can be used interchangeably. Namespace is displayed in the Open Catalog user interface. Open Catalog uses namespaces to hold a collection of objects and the term _namespace_ is primarily used in the Open Catalog documentation. For more information about namespaces, see Namespace. However, if you’re using a third-party query engine, such as Apache Spark, and you run the CREATE SCHEMA or CREATE DATABASE command, you create a namespace in Open Catalog. You can also run the CREATE NAMESPACE command to create a namespace. |
namespace | namespace | Like Snowflake, Open Catalog also uses namespaces but with key differences compared to how Snowflake uses namespaces. A catalog in Open Catalog comprises top-level namespaces, which you define, along with any number of nested namespaces beneath them, which you also define. Nested namespaces allow you to register tables with the same name within the same catalog. For example, a catalog named Also, in Open Catalog, you can group tables under any namespace in the namespace hierarchy, including top-level namespaces. For more information about namespaces, including a conceptual diagram of a sample Open Catalog structure, see key concepts of Open Catalog. |
role | principal role | In Open Catalog, principal roles are like roles in Snowflake but with key differences. You don’t grant privileges to a principal role. Instead, you grant privileges to a catalog role, which you then grant to a principal role, and then you grant the principal role to a service principal, thus bestowing the privileges on the service principal. Also, you can’t assign principal roles to other principal roles. You can only grant one principal role to a service principal. You can use a principal role to logically group service principals together. The scope of a principal role is across all catalogs. Also, there aren’t different types of principal roles. For more information, see Principal role in the Open Catalog documentation. |
database role | catalog role | Open Catalog uses catalog roles, which are like database roles in Snowflake. Catalog roles specify a set of permissions for actions on a catalog or objects in the catalog. The scope of a catalog role is the catalog where it is created. In Open Catalog, you grant privileges to catalog roles. Next, you grant catalog roles to principal roles, and then you grant principal roles to service principals, which grants access to resources. You can grant multiple catalog roles to a principal role but only one principal role to a service principal. For more information, see Catalog role in the Open Catalog documentation. |
user | service principal | In the context of access control, there is no concept of a user in Open Catalog. In Open Catalog, privileges are bestowed on service principals, not users. Query engines use service principals to connect to catalogs. For more information, see Service principal in the Open Catalog documentation. |
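The grant chain described above (privileges go to a catalog role, catalog roles go to a principal role, and a principal role goes to a service principal) can be sketched as a small Python model. This is a toy illustration under stated assumptions, not the Open Catalog API: the class names, method names, and privilege strings are all hypothetical, and the one-principal-role limit is modeled exactly as this topic states it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogRole:
    name: str
    catalog: str  # a catalog role is scoped to the catalog where it is created
    privileges: set[str] = field(default_factory=set)


@dataclass
class PrincipalRole:
    name: str
    # Privileges are never granted directly to a principal role, only catalog roles.
    catalog_roles: list[CatalogRole] = field(default_factory=list)

    def grant_catalog_role(self, role: CatalogRole) -> None:
        self.catalog_roles.append(role)


@dataclass
class ServicePrincipal:
    name: str
    principal_role: Optional[PrincipalRole] = None

    def grant_principal_role(self, role: PrincipalRole) -> None:
        # Models the limit stated in this topic: one principal role per service principal.
        if self.principal_role is not None:
            raise ValueError(f"{self.name} already has a principal role")
        self.principal_role = role

    def effective_privileges(self, catalog: str) -> set[str]:
        """Union of privileges from catalog roles scoped to the given catalog."""
        if self.principal_role is None:
            return set()
        granted: set[str] = set()
        for role in self.principal_role.catalog_roles:
            if role.catalog == catalog:
                granted |= role.privileges
        return granted


# Hypothetical names and privilege strings, for illustration only.
reader = CatalogRole("reader", catalog="analytics", privileges={"read_tables"})
lister = CatalogRole("lister", catalog="analytics", privileges={"list_tables"})
engineers = PrincipalRole("engineers")
engineers.grant_catalog_role(reader)
engineers.grant_catalog_role(lister)

spark_engine = ServicePrincipal("spark_engine")
spark_engine.grant_principal_role(engineers)
print(sorted(spark_engine.effective_privileges("analytics")))
```

In this model the service principal holds no privileges of its own. Everything it can do is derived from the catalog roles attached to its principal role, and only within the catalog each catalog role belongs to.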
Legal Notices
Apache®, Apache Iceberg™, Apache Spark™, Apache Flink®, and Flink® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.