Database Replication Considerations

This topic describes the behavior of certain Snowflake features in secondary databases and provides general guidance for working with replicated objects and data.

In this Topic:

Replication and Automatic Clustering

In the primary database, Snowflake monitors clustered tables using Automatic Clustering and reclusters them as needed. As part of a refresh operation, clustered tables are replicated to a secondary database with the current sorting of the table micro-partitions. As such, reclustering is not performed again on the clustered tables in the secondary database, which would be redundant.

If a secondary database contains clustered tables and the database is promoted to become the primary database, Snowflake begins Automatic Clustering of the tables in this database while simultaneously suspending the monitoring of clustered tables in the previous primary database.

See Replication and Materialized Views (in this topic) for information about Automatic Clustering for materialized views.

Replication and Materialized Views

In the primary database, Snowflake performs automatic background maintenance of materialized views. When a base table changes, all materialized views defined on the table are updated by a background service that uses compute resources provided by Snowflake. In addition, if Automatic Clustering is enabled for a materialized view, then the view is monitored and reclustered as necessary in the primary database.

A refresh operation replicates the materialized view definitions to a secondary database; the materialized view data is not replicated. Automatic background maintenance of materialized views in a secondary database is enabled by default. If Automatic Clustering is enabled for a materialized view in the primary database, automatic monitoring and reclustering of the materialized view in the secondary database is also enabled.

Also see Dangling References (in this topic).

Note

  • The charges for automated background synchronization of materialized views are billed to each account that contains a secondary database.

Replication and External Tables

External Tables in the Primary Database

External tables in the primary database currently cause the replication or refresh operation to fail with the following error message:

003906 (55000): SQL execution error:
Primary database contains an external table '<database_name>'.Replication of a database with external table is not supported

To work around this limitation, we recommend that you move the external tables into a separate database that is not replicated. Alternatively, if you are migrating your databases to another account, you could clone the primary database, drop the external table from the clone, and then replicate the cloned database. After you promote the secondary database in the target account, you would need to recreate the external tables in the database.

External Tables in a Secondary Database

External tables can exist in a secondary database if it was the primary database at a prior time and the external tables were created during that time. Once another database is promoted to be the designated primary database, it becomes a secondary database. External tables in a secondary database currently cause the refresh operation to fail with the following error:

003958 (55000): SQL execution error:
Secondary database contains an external table '<table_name>'. Replication of a database with external table is not supported.

To work around this limitation, we recommend that you move the external table into a separate database that is not a replicated secondary database.

Replication and Security Policies

Tags & Policies

These objects (masking policies, row access policies, and tags) and their assignments can be replicated using database replication and replication groups.

For database replication, the replication operation fails if either of the following conditions is true:

  • The primary database is in an Enterprise (or higher) account and contains a policy/tag but one or more of the accounts approved for replication are on lower editions.

  • An object contained in the primary database has a dangling reference to a tag in a different database.

The dangling reference behavior for database replication can be avoided when replicating multiple databases in a replication group.

Note

If using failover or failback actions, the Snowflake account must be Business Critical Edition or higher.

For more information, see Database Replication and Failover/Failback.

Tag-based masking policies

If a tag was previously replicated using database replication from the primary database to the secondary database, and a masking policy is assigned to a tag that is stored in a schema within the primary database, note the following regarding the refresh operation on the secondary database:

  • If the tag-based masking policy is assigned to a table, view, or column in the secondary account, the refresh operation works.

  • If the replicated tag is assigned to a schema or database in the secondary account, or assigned to the secondary account itself, the refresh operation for these objects fails.

For details on this policy, see Tag-based Masking Policies.

Network policies:

For details, see network policy replication.

Replication and Streams

As indicated in Replicated Database Objects, currently, table streams in a primary database are not replicated to any secondary databases. Delta streams can be created on secondary tables, but attempting to create append-only streams produces a user error. The CHANGES clause works similarly. To track the delta change data on tables in a secondary database, use the CHANGES clause in queries on tables in a secondary database.

In addition, if change tracking is enabled for a table, only the table version current when a secondary database is refreshed is replicated. Any table versions in-between refreshes are not replicated. To track append-only change data in tables in a secondary database, it is currently necessary to promote the secondary database to serve as the primary database. Then, users could create table streams in the same database or another database in the same account. Those table streams only track changes in the tables that were initiated and completed after the secondary database was promoted.

To create a stream at the point in time when the database that stores a table was promoted, use the CREATE STREAM … AT | BEFORE syntax. The AT | BEFORE clause determines the point in the past from which historical data is requested for the table.

Replication and Time Travel

Time Travel and Fail-safe data is maintained independently for a secondary database and is not replicated from the primary database. Querying tables and views in a secondary database using Time Travel can produce different results than when executing the same query in the primary database.

Historical Data

Historical data available to query in a primary database using Time Travel is not replicated to secondary databases.

For example, suppose data is loaded continuously into a table every 10 minutes using Snowpipe, and a secondary database is refreshed every hour. The refresh operation only replicates the latest version of the table. While every hourly version of the table within the retention window is available for query using Time Travel, none of the iterative versions within each hour (the individual Snowpipe loads) are available.

Data Retention Period

The data retention period for tables in a secondary database begins when the secondary database is refreshed with the DML operations (i.e. changing or deleting data) written to tables in the primary database.

Note

The data retention period parameter, DATA_RETENTION_TIME_IN_DAYS, is only replicated to database objects in the secondary database, not to the database itself. For more details about parameter replication, see Parameters.

Replication and Large, High-Churn Tables

When one or more rows of a table are updated or deleted, all of the impacted micro-partitions that store this data in a primary database are re-created and must be synchronized to secondary databases. For large, high-churn dimension tables, the replication costs can be significant.

For large, high-churn dimension tables that incur significant replication costs, the following mitigations are available:

  • Replicate any primary databases that store such tables at a lower frequency.

  • Change your data model to reduce churn.

For more information, see Managing Costs for Large, High-Churn Tables.

Replication and Cloning

Cloned objects are replicated physically rather than logically to secondary databases. That is, cloned tables in a standard database do not contribute to the overall data storage unless or until DML operations on the clone add to or modify existing data. However, when a cloned table is replicated to a secondary database, the physical data is also replicated, increasing the data storage usage for your account.

Dangling References

References to Objects in Another Database

Carefully analyze whether views or table constraints in a primary database reference objects in another database.

Views
  • Non-materialized views that reference any object in another database (e.g. table columns, other views, UDFs, or stages) can be replicated, because this type of reference is name based. Name-based references do not cause replication to fail; however, queries on the view in secondary databases will fail if the other database(s) are not replicated in the same region.

    For example, suppose view v1 in database d1 references tables t1 and t2 in databases d1 and d2, respectively. To successfully query view V1 in the secondary database d1, secondary database d2 must also exist in the account (e.g. as another secondary database). In addition, for consistent query results with the primary databases, secondary databases d1 and d2 must be refreshed at the same time.

  • Materialized views that reference any object in another database currently cause the replication or refresh operation to fail with the following error message:

    Dangling references in the snapshot. Correct the errors before refreshing again. The following references are missing (referred entity <- [referring entities])
    

    That is because materialized views reference objects by ID rather than name. A database snapshot cannot resolve ID-based references to objects outside the database.

    To work around this limitation, we recommend that you store materialized views and the objects they reference in the same database.

Constraints

Currently, dangling foreign keys cause the replication to fail with the following error message:

Dangling references in the snapshot. Correct the errors before refreshing again. The following references are missing (referred entity <- [referring entities])

This situation occurs when a foreign key in the primary database references a primary key in another database, or vice-versa. That is because constraint references are ID-based. A database snapshot cannot resolve ID-based references to objects outside its own database.

To view the foreign key references in your account, query the TABLE_CONSTRAINTS view in the Information Schema or ACCOUNT_USAGE schema.

To work around this limitation, we recommend that you store linked tables in the same database.

Sequences

Currently, dangling sequences cause the replication to fail with the following error message:

Dangling references in the snapshot. Correct the errors before refreshing again. The following references are missing (referred entity <- [referring entities])

This situation occurs when a table in a primary database references a sequence in another database. That is because sequence references are ID-based. A database snapshot cannot resolve ID-based references to objects outside its own database.

To work around this limitation, we recommend that you reference sequences in the same database.

Policies & Tags

A dangling reference for a masking policy, row access policy, or tag causes the replication to fail with the following error message:

Dangling references in the snapshot. Correct the errors before refreshing again. The following references are missing (referred entity <- [referring entities])

This situation occurs when the policy/tag and the object that has the policy/tag assigned to it exist in different databases. For example, a table named db1.s1.t1, a row access policy named db2.s1.rap1, and the row access policy is assigned to the table.

This dangling reference behavior applies to database replication.

To work around this, replicate databases using replication groups.

A dangling reference can also occur when a network policy is assigned to a user in the context of Replicating Account Objects Across Multiple Accounts. For details, see:

References to Dropped Objects

Dropping an object that is referenced by another object in the same, or another, database results in a dangling reference. When an object in the primary database references a dropped object, a replication operation fails with the following error message:

Dangling references in the snapshot. Correct the errors before refreshing again. The following references are missing (referred entity <- [referring entities])

To work around this limitation, we recommend that you complete any one of the following steps:

  • Undrop any referenced objects.

  • Modify the referring objects (for example, modify a materialized view using ALTER MATERIALIZED VIEW). Either reference a different object or remove the reference to the dropped object.

  • Drop any objects in the primary database that reference dropped objects.

Replication and Access Control

Secondary databases are read-only; however, the database owner (i.e. the role that has the OWNERSHIP privilege on the secondary database) can grant privileges on database objects (i.e. schemas, tables, views, etc.) to other roles using GRANT <privileges> … TO ROLE.

Replication of Multiple Databases

When multiple databases are replicated, point in time consistency across databases is not available. A snapshot of each primary database is created independently and changes to the secondary database are committed independently. This can be problematic if you have views that join across tables in different databases or depend on cross-database transactions. For example, a transaction that updates two primary databases atomically might not be reflected in the secondary databases at the same time.

Historical Usage Data

Historical usage data for activity in the primary database is not replicated to secondary databases. Each account has its own query history, login history, etc.

Historical usage data includes the query data returned by the following Snowflake Information Schema table functions or Account Usage views:

  • COPY_HISTORY

  • LOGIN_HISTORY

  • QUERY_HISTORY

  • etc.

Back to top