This topic provides important considerations when cloning objects in Snowflake, particularly databases, schemas, and non-temporary tables. Factors such as DDL and DML transactions (on the source object), Time Travel, and data retention periods can affect the object clone.
In this Topic:
Access Control Privileges for Cloned Objects¶
A cloned object does not retain any granted privileges on the source object itself (i.e. clones do not automatically have the same privileges as their sources). A system administrator or the owner of the cloned object must explicitly grant any required privileges to the newly-created clone.
However, if the source object is a database or schema, for child objects contained in the source, the clone replicates all granted privileges on the corresponding child objects:
For databases, contained objects include schemas, tables, views, etc.
For schemas, contained objects include tables, views, etc.
Cloning and Snowflake Objects¶
This section describes special cloning considerations with regard to specific Snowflake objects.
Cloning and Stages¶
Individual external named stages can be cloned. An external stage references a bucket or container in external cloud storage; cloning an external stage has no impact on the referenced cloud storage.
Internal (i.e. Snowflake) named stages cannot be cloned.
When cloning a database or schema:
External named stages that were present in the source when the cloning operation started are cloned.
Tables are cloned, which means the internal stage associated with each table is also cloned. Any data files that were present in a table stage in the source database or schema are not copied to the clone (i.e. the cloned table stages are empty).
Internal named stages are not cloned.
Cloning and Pipes¶
When a database or schema is cloned, any pipes in the source container that reference an internal (i.e. Snowflake) stage are not cloned.
However, any pipes that reference an external stage are cloned. This results in the following behavior:
If the target table is fully qualified in the COPY statement in the pipe definition (in the form of
schema_name.table_name), this can result in duplicate data getting loaded into the target table in the source database or schema by each pipe.
If the target table is not fully qualified in the pipe definition, then the data is loaded into the target table (e.g.
mytable) in the source and cloned databases/schemas.
A pipe clone is paused by default. A pipe can be resumed by executing an ALTER PIPE … RESUME statement.
Cloning and Streams¶
Currently, when a database or schema that contains source tables and streams is cloned, any unconsumed records in the streams (in the clone) are inaccessible. This behavior is consistent with Time Travel for tables. If a table is cloned, historical data for the table clone begins at the time/point when the clone was created.
Impact of DDL on Cloning¶
Cloning is fast, but not instantaneous, particularly for large objects (e.g. tables). As such, if DDL statements are executed on source objects (e.g. renaming tables in a schema) while the cloning operation is in progress, the changes may not be represented in the clone. This is because DDL statements are atomic and not part of multi-statement transactions.
Furthermore, Snowflake does not record which object names were present when the cloning operation started and which names changed. As such, DDL statements that rename (or drop and recreate) source child objects compete with any in-progress cloning operations and can cause name conflicts.
In the following example, the
t_sales table is dropped and another table is altered and given the same name as the dropped table while
the parent database is being cloned, producing an error:
CREATE OR REPLACE DATABASE staging_sales CLONE sales; DROP TABLE sales.public.t_sales; ALTER TABLE sales.public.t_sales_20170522 RENAME TO sales.public.t_sales; 002002 (42710): None: SQL compilation error: Object 'T_SALES' already exists.
To avoid conflicts in name resolution during a cloning operation, we suggest refraining from renaming objects to a name previously used by a dropped object until cloning is completed.
Impact of DML and Data Retention on Cloning¶
The DATA_RETENTION_TIME_IN_DAYS parameter specifies the number of days for which Snowflake retains historical data for performing
Time Travel actions on an object. Because the data retained for Time Travel incurs storage costs at the table-level, some users set this parameter
0 for some tables, effectively disabling data retention for these tables (i.e. when the value is set to
0, Time Travel data
retained for DML transactions is purged, incurring negligible additional storage costs).
Cloning operations require time to complete, particularly for large tables. During this period, DML transactions can alter the data in a source
table. Subsequently, Snowflake attempts to clone the table data as it existed when the operation began. However, if data is purged for DML
transactions that occur during cloning (because the retention time for the table is
0), the data is unavailable to complete the operation,
producing an error similar to the following:
ProgrammingError occured: "000707 (02000): None: Data is not available." with query id None
As a workaround, we recommend either of the following best practices when cloning an object:
Refrain, if possible, from executing DML transactions on the source object (or any of its children) until after the cloning operation completes.
If this isn’t possible, prior to starting cloning, set
DATA_RETENTION_TIME_IN_DAYS=1for all tables in the schema (or database if you are cloning an entire database). Once the operation completes, remember to reset the parameter value back to
0for those tables in the source, if desired.
You might also want to set the value to
0for the cloned tables (if you plan to make DML changes to the cloned tables and do not wish to incur additional storage costs for Time Travel on the tables).
Cloning Using Time Travel (Databases, Schemas, Tables, and Streams Only)¶
This section provides information to consider when using Time Travel to clone objects at a specific time/point in the past.
Cloning of Historical Objects¶
If the source object did not exist at the time/point set in the AT | BEFORE clause, an error is returned.
In the following example, a CREATE TABLE … CLONE statement attempts to clone the source table at a point in the past (30 minutes prior) when it didn’t exist:
CREATE TABLE t_sales (numeric integer) data_retention_time_in_days=1; CREATE OR REPLACE TABLE sales.public.t_sales_20170522 CLONE sales.public.t_sales at(offset => -60*30); 002003 (02000): SQL compilation error: Object 'SALES.PUBLIC.T_SALES' does not exist.
Any child object in a cloned database or schema that did not exist at the specified time/point is not cloned.
If a pipe object was recreated (using the CREATE OR REPLACE PIPE syntax) or dropped since the point in time specified in the AT | BEFORE clause, the cloning operation fails.
Cloning of Historical Object Metadata¶
An object clone inherits the name and structure of the source object current at the time the CREATE <object> … CLONE statement is executed or at a specified time/point in the past using Time Travel. An object clone inherits any other metadata, such as comments or table clustering keys, that is current in the source object at the time the statement is executed, regardless of whether Time Travel is used.
To ensure consistent behavior in long cloning operations, when an AT or BEFORE clause is not specified for a CREATE <object> … CLONE statement, the cloning operation internally sets the AT clause value as the timestamp when the statement was initiated.