Database Replication Considerations

This topic describes the behavior of certain Snowflake features in secondary databases and provides general guidance for working with replicated objects and data.

In this Topic:

This section describes the behavior of certain Snowflake features in secondary databases and provides general guidance for working with replicated objects and data.

Replication and Automatic Clustering

In the primary database, Snowflake monitors clustered tables using Automatic Clustering and reclusters them as needed. As part of a refresh operation, clustered tables are replicated to a secondary database with the current sorting of the table micro-partitions. As such, reclustering is not performed again on the clustered tables in the secondary database, which would be redundant.

If a secondary database contains clustered tables and the database is promoted to become the primary database, Snowflake begins Automatic Clustering of the tables in this database while simultaneously suspending the monitoring of clustered tables in the previous primary database.

See Replication and Materialized Views (in this topic) for information about Automatic Clustering for materialized views.

Replication and Materialized Views

In the primary database, Snowflake performs automatic background maintenance of materialized views. When a base table changes, all materialized views defined on the table are updated by a background service that uses compute resources provided by Snowflake. In addition, if Automatic Clustering is enabled for a materialized view, then the view is monitored and reclustered as necessary in the primary database.

A refresh operation replicates the materialized view definitions to a secondary database; however, the materialized view data is not replicated, meaning some or all of the data in the materialized views could become out of date.

To perform automatic background maintenance of materialized views in a secondary database, explicitly set AUTO_REFRESH_MATERIALIZED_VIEWS_ON_SECONDARY = TRUE on the secondary database either when you create the secondary database (using CREATE DATABASE … AS REPLICA OF) or later (using ALTER DATABASE). If Automatic Clustering is enabled for a materialized view in the primary database, then setting AUTO_REFRESH_MATERIALIZED_VIEWS_ON_SECONDARY = TRUE on a secondary database also enables automatic monitoring and reclustering of the materialized view in the secondary database.

Also see References to Objects in Another Database (in this topic).

Note

Materialized view maintenance charges are billed to the account where the AUTO_REFRESH_MATERIALIZED_VIEWS_ON_SECONDARY parameter is enabled for a secondary database.

Replication and Masking Policies

If either of the following conditions is true, then the initial replication operation or a subsequent refresh operation fails:

  • The primary database is in an Enterprise (or higher) account and contains a masking policy but one or more of the accounts approved for replication are on lower editions.

  • A masking policy contained in the primary database is applied to columns in a table or view in another database, or vice-versa.

Time Travel

Querying tables and views in a secondary database using Time Travel can produce different results than when executing the same query in the primary database.

Historical Data

Historical data available to query in a primary database using Time Travel is not replicated to secondary databases.

For example, suppose data is loaded continuously into a table every 10 minutes using Snowpipe, and a secondary database is refreshed every hour. The refresh operation only replicates the latest version of the table. While every hourly version of the table within the retention window is available for query using Time Travel, none of the iterative versions within each hour (the individual Snowpipe loads) are available.

Data Retention Period

The data retention period for tables in a secondary database begins when the secondary database is refreshed with the DML operations (i.e. changing or deleting data) written to tables in the primary database.

Replication and Large, High-Churn Tables

When one or more rows of a table are updated or deleted, all of the impacted micro-partitions that store this data in a primary database are re-created and must be synchronized to secondary databases. For large, high-churn dimension tables, the replication costs can be significant.

For large, high-churn dimension tables that incur significant replication costs, the following mitigations are available:

  • Replicate any primary databases that store such tables at a lower frequency.

  • Change your data model to reduce churn.

For more information, see Managing Costs for Large, High-Churn Tables.

Replication and Cloning

Cloned objects are replicated physically rather than logically to secondary databases. That is, cloned tables in a standard database do not contribute to the overall data storage unless or until DML operations on the clone add to or modify existing data. However, when a cloned table is replicated to a secondary database, the physical data is also replicated, increasing the data storage usage for your account.

References to Objects in Another Database

Carefully analyze whether views or table constraints in a primary database reference another database.

Views
  • Non-materialized views that reference any object in another database (e.g. table columns, other views, UDFs, or stages) can be replicated, because this type of reference is name based. Name-based references do not cause replication to fail; however, queries on the view in secondary databases will fail if the other database(s) are not replicated in the same region.

    For example, suppose view v1 in database d1 references tables t1 and t2 in databases d1 and d2, respectively. To successfully query view V1 in the secondary database d1, secondary database d2 must also exist in the account (e.g. as another secondary database). In addition, for consistent query results with the primary databases, secondary databases d1 and d2 must be refreshed at the same time.

  • Materialized views that reference any object in another database currently cause the replication or refresh operation to fail. That is because materialized views reference objects by ID rather than name. A database snapshot cannot resolve ID-based references to objects outside the database.

Constraints

Currently, dangling foreign keys will cause the replication to fail. This situation occurs when a foreign key in the primary database references a primary key in another database, or vice-versa. That is because constraint references are ID-based. A database snapshot cannot resolve ID-based references to objects outside its own database.

To view the foreign key references in your account, query the TABLE_CONSTRAINTS view in the Information Schema or ACCOUNT_USAGE schema.

Replication and Access Control

Secondary databases are read-only; however, the database owner (i.e. the role that has the OWNERSHIP privilege on the secondary database) can grant privileges on database objects (i.e. schemas, tables, views, etc.) to other roles using GRANT <privileges> … TO ROLE.

Historical Usage Data

Historical usage data for activity in the primary database is not replicated to secondary databases. Each account has its own query history, login history, etc.

Historical usage data includes the query data returned by the following Information Schema table functions or Account Usage views:

  • COPY_HISTORY

  • LOGIN_HISTORY

  • QUERY_HISTORY

  • etc.