Continuous data pipeline examples¶
This topic provides practical examples of use cases for data pipelines.
Prerequisites¶
The role used to execute the SQL statements in these examples requires the following access control privileges:
EXECUTE TASK
    Global EXECUTE TASK privilege to run tasks.
USAGE
    USAGE privilege on the database and schema in which the SQL statements are executed, as well as on the warehouse that runs any tasks in these examples.
CREATE object
    Various CREATE object privileges on the schema in which the SQL statements are executed, to create objects such as tables, streams, and tasks.
For more information about access control in Snowflake, see Overview of Access Control.
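As a sketch, the privileges above could be granted with statements along these lines. The role, database, schema, and warehouse names (pipeline_role, mydb, myschema, mywh) are placeholders, not names from these examples:

```sql
-- Placeholder names throughout; adjust to your environment.
GRANT EXECUTE TASK ON ACCOUNT TO ROLE pipeline_role;

GRANT USAGE ON DATABASE mydb TO ROLE pipeline_role;
GRANT USAGE ON SCHEMA mydb.myschema TO ROLE pipeline_role;
GRANT USAGE ON WAREHOUSE mywh TO ROLE pipeline_role;

-- Privileges to create the objects used in these examples.
GRANT CREATE TABLE, CREATE STREAM, CREATE TASK
  ON SCHEMA mydb.myschema TO ROLE pipeline_role;
```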
Transform loaded JSON data on a schedule¶
The following example loads raw JSON data into a single landing table named raw. Two tasks query table streams created on the raw table and insert subsets of rows into multiple tables. Because each task consumes the change data capture records in a table stream, multiple streams are required.
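A minimal sketch of this pattern follows. The object names (raw, rawstream1, rawstream2, names, visits, mywh) and the JSON field names are illustrative assumptions; the key point is that each task reads its own stream, because querying a stream in a DML statement consumes its change records:

```sql
-- Landing table for raw JSON, with two streams tracking its changes.
CREATE OR REPLACE TABLE raw (var VARIANT);
CREATE OR REPLACE STREAM rawstream1 ON TABLE raw;
CREATE OR REPLACE STREAM rawstream2 ON TABLE raw;

-- Target tables for the extracted subsets of rows.
CREATE OR REPLACE TABLE names (id INT, first_name STRING, last_name STRING);
CREATE OR REPLACE TABLE visits (id INT, dt DATE);

-- One task per stream; SYSTEM$STREAM_HAS_DATA skips runs when the stream is empty.
CREATE OR REPLACE TASK raw_to_names
  WAREHOUSE = mywh
  SCHEDULE = '1 MINUTE'
WHEN
  SYSTEM$STREAM_HAS_DATA('rawstream1')
AS
  INSERT INTO names (id, first_name, last_name)
    SELECT var:id::INT, var:fname::STRING, var:lname::STRING
    FROM rawstream1;

CREATE OR REPLACE TASK raw_to_visits
  WAREHOUSE = mywh
  SCHEDULE = '1 MINUTE'
WHEN
  SYSTEM$STREAM_HAS_DATA('rawstream2')
AS
  INSERT INTO visits (id, dt)
    SELECT var:id::INT, TO_DATE(var:visit_dt)
    FROM rawstream2;
```

Tasks are created in a suspended state, so each task must be resumed (ALTER TASK raw_to_names RESUME;) before it runs on its schedule.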
Unload data on a schedule¶
The following example unloads the change data capture records in a stream into an internal (i.e. Snowflake) stage.
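A sketch of this approach, assuming a source table mytable, a stream mystream, and an internal stage my_int_stage (all placeholder names). Because a stream can be queried like a table, COPY INTO a stage location can read from it directly:

```sql
-- Stream capturing changes to the source table (placeholder names).
CREATE OR REPLACE STREAM mystream ON TABLE mytable;

-- Task that periodically unloads any new change records to the internal stage.
CREATE OR REPLACE TASK unload_changes
  WAREHOUSE = mywh
  SCHEDULE = '60 MINUTE'
WHEN
  SYSTEM$STREAM_HAS_DATA('mystream')
AS
  COPY INTO @my_int_stage/changes/ FROM mystream;
```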
Refresh external table metadata on a schedule¶
The following example refreshes the metadata for an external table named mydb.myschema.exttable (using ALTER EXTERNAL TABLE … REFRESH) on a schedule.
Note
When an external table is created, the AUTO_REFRESH parameter is set to TRUE by default. We recommend accepting this default for external tables that reference data files in either Amazon S3 or Microsoft Azure stages. However, the automatic refresh option is not currently available for external tables that reference Google Cloud Storage stages. For these external tables, manually refreshing the metadata on a schedule can be useful.
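A minimal sketch of such a scheduled refresh, assuming a warehouse named mywh (the external table name matches the one above):

```sql
-- Task that refreshes the external table metadata every 5 minutes.
CREATE OR REPLACE TASK exttable_refresh
  WAREHOUSE = mywh
  SCHEDULE = '5 MINUTE'
AS
  ALTER EXTERNAL TABLE mydb.myschema.exttable REFRESH;

-- Tasks are created suspended; resume the task to start the schedule.
ALTER TASK exttable_refresh RESUME;
```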