This topic provides important considerations when cloning objects in Snowflake, particularly databases, schemas, and non-temporary tables. Factors such as DDL and DML transactions (on the source object), Time Travel, and data retention periods can affect the object clone.
In this Topic:
Access Control Privileges for Cloned Objects¶
For most Snowflake objects, a cloned object does not retain any privileges granted on the source object. That is, the object clones do not automatically have the same privileges as their source objects. The owner of the cloned object or a role with the MANAGE GRANTS privilege must explicitly grant any required privileges to the newly-created clone. The exception is tables. The CREATE TABLE … CLONE command syntax supports the COPY GRANTS parameter. When the COPY GRANTS parameter is specified in a CREATE TABLE statement, the create operation copies all privileges, except OWNERSHIP, from the source table to the new table.
If the source object is a database or schema, for child objects contained in the source, the clone replicates all granted privileges on the corresponding child objects:
For databases, contained objects include schemas, tables, views, etc.
For schemas, contained objects include tables, views, etc.
Cloning and Snowflake Objects¶
This section describes special cloning considerations with regard to specific Snowflake objects.
Cloning and Object Parameters¶
Object parameters are set at the account level but can be overridden for individual objects. Cloned objects inherit the parameter value set at the account level by default. If a parameter is explicitly set on a source database, schema, or table, the object clone inherits the override value in accordance with the inheritance hierarchy for the parameter. For more information about object parameters, see Parameters.
Cloning and Default Sequences¶
In a table, a column can reference a sequence that generates default values. When a table is cloned, the cloned table references the source or cloned sequence:
If the database or schema containing both the table and sequence is cloned, the cloned table references the cloned sequence.
Otherwise, the cloned table references the source sequence.
For example, if the sequence is defined in a different database or schema, the cloned table references the source sequence. Or if you clone just the table itself, the cloned table references the source sequence.
If you do not want the new table to continue using the source sequence, run the following command:
ALTER TABLE <table_name> ALTER COLUMN <column_name> SET DEFAULT <new_sequence>.nextval;
Cloning and Foreign Key Constraints¶
A table can have a foreign key constraint that references a table that includes the primary key. When a table with a foreign key constraint is cloned, the cloned table references the source or cloned table that includes the primary key:
If the database or schema containing both tables is cloned, the cloned table with the foreign key references the primary key in the other cloned table.
If the tables are in separate databases or schemas, the cloned table references the primary key in the source table.
Cloning and Stages¶
Individual external named stages can be cloned. An external stage references a bucket or container in external cloud storage; cloning an external stage has no impact on the referenced cloud storage.
Internal (i.e. Snowflake) named stages cannot be cloned.
When cloning a database or schema:
External named stages that were present in the source when the cloning operation started are cloned.
Tables are cloned, which means the internal stage associated with each table is also cloned. Any data files that were present in a table stage in the source database or schema are not copied to the clone (i.e. the cloned table stages are empty).
Internal named stages are not cloned.
Cloning and Pipes¶
When a database or schema is cloned, any pipes in the source container that reference an internal (i.e. Snowflake) stage are not cloned.
However, any pipes that reference an external stage are cloned. This includes any pipe objects where the INTEGRATION parameter is set. This parameter points to a notification integration to enable auto-ingest Snowpipe when loading data from files in Google Cloud Storage or Microsoft Azure blob storage.
When a data file is created in a stage location (e.g. blob storage container), a copy of the notification is sent to every pipe that matches the stage location. This results in the following behavior:
If a table is fully qualified in the COPY statement in the pipe definition (in the form of
schema_name.table_name), then Snowpipe loads duplicate data into the source table (i.e. the
database.schema.tablein the COPY statement) for each pipe.
If a table is not fully qualified in the pipe definition, then Snowpipe loads the data into the table (e.g.
mytable) in the source and cloned databases/schemas.
The default state of a pipe clone is as follows:
AUTO_INGEST = FALSE, a cloned pipe is paused by default.
AUTO_INGEST = TRUE, a cloned pipe is set to the
STOPPED_CLONEDstate. In this state, pipes do not accumulate event notifications as a result of newly staged files. When a pipe is explicitly resumed, it only processes data files triggered as a result of new event notifications.
A pipe clone in either state can be resumed by executing an ALTER PIPE … RESUME statement.
Cloning and Streams¶
Currently, when a database or schema that contains source tables and streams is cloned, any unconsumed records in the streams (in the clone) are inaccessible. This behavior is consistent with Time Travel for tables. If a table is cloned, historical data for the table clone begins at the time/point when the clone was created.
Impact of DDL on Cloning¶
Cloning is fast, but not instantaneous, particularly for large objects (e.g. tables). As such, if DDL statements are executed on source objects (e.g. renaming tables in a schema) while the cloning operation is in progress, the changes may not be represented in the clone. This is because DDL statements are atomic and not part of multi-statement transactions.
Furthermore, Snowflake does not record which object names were present when the cloning operation started and which names changed. As such, DDL statements that rename (or drop and recreate) source child objects compete with any in-progress cloning operations and can cause name conflicts.
In the following example, the
t_sales table is dropped and another table is altered and given the same name as the dropped table while
the parent database is being cloned, producing an error:
CREATE OR REPLACE DATABASE staging_sales CLONE sales; DROP TABLE sales.public.t_sales; ALTER TABLE sales.public.t_sales_20170522 RENAME TO sales.public.t_sales; 002002 (42710): None: SQL compilation error: Object 'T_SALES' already exists.
To avoid conflicts in name resolution during a cloning operation, we suggest refraining from renaming objects to a name previously used by a dropped object until cloning is completed.
Impact of DML and Data Retention on Cloning¶
The DATA_RETENTION_TIME_IN_DAYS parameter specifies the number of days for which Snowflake retains historical data for performing
Time Travel actions on an object. Because the data retained for Time Travel incurs storage costs at the table-level, some users set this parameter
0 for some tables, effectively disabling data retention for these tables (i.e. when the value is set to
0, Time Travel data
retained for DML transactions is purged, incurring negligible additional storage costs).
Cloning operations require time to complete, particularly for large tables. During this period, DML transactions can alter the data in a source
table. Subsequently, Snowflake attempts to clone the table data as it existed when the operation began. However, if data is purged for DML
transactions that occur during cloning (because the retention time for the table is
0), the data is unavailable to complete the operation,
producing an error similar to the following:
ProgrammingError occured: "000707 (02000): None: Data is not available." with query id None
As a workaround, we recommend either of the following best practices when cloning an object:
Refrain, if possible, from executing DML transactions on the source object (or any of its children) until after the cloning operation completes.
If this is not possible, prior to starting cloning, set
DATA_RETENTION_TIME_IN_DAYS=1for all tables in the schema (or database if you are cloning an entire database). Once the operation completes, remember to reset the parameter value back to
0for those tables in the source, if desired.
You might also want to set the value to
0for the cloned tables (if you plan to make DML changes to the cloned tables and do not wish to incur additional storage costs for Time Travel on the tables).
Cloning Using Time Travel (Databases, Schemas, Tables, and Streams Only)¶
This section provides information to consider when using Time Travel to clone objects at a specific time/point in the past.
Cloning of Historical Objects¶
If the source object did not exist at the time/point set in the AT | BEFORE clause, an error is returned.
In the following example, a CREATE TABLE … CLONE statement attempts to clone the source table at a point in the past (30 minutes prior) when it didn’t exist:
CREATE TABLE t_sales (numeric integer) data_retention_time_in_days=1; CREATE OR REPLACE TABLE sales.public.t_sales_20170522 CLONE sales.public.t_sales at(offset => -60*30); 002003 (02000): SQL compilation error: Object 'SALES.PUBLIC.T_SALES' does not exist.
Any child object in a cloned database or schema that did not exist at the specified time/point is not cloned.
The cloning operation fails in the following scenarios:
If the specified Time Travel time is beyond the retention time of any current child of the cloned database or schema.
If a pipe object with
AUTO-INGEST = TRUEset was recreated (using the CREATE OR REPLACE PIPE syntax) or dropped since the point in time specified in the AT | BEFORE clause. This limitation does not apply to pipe objects created for manual Snowpipe ingest using the REST API (i.e. with
AUTO-INGEST = FALSE).
Cloning of Historical Object Metadata¶
An object clone inherits the name and structure of the source object current at the time the CREATE <object> … CLONE statement is executed or at a specified time/point in the past using Time Travel. An object clone inherits any other metadata, such as comments or table clustering keys, that is current in the source object at the time the statement is executed, regardless of whether Time Travel is used.
To ensure consistent behavior in long cloning operations, when an AT or BEFORE clause is not specified for a CREATE <object> … CLONE statement, the cloning operation internally sets the AT clause value as the timestamp when the statement was initiated.