Managing Snowpipe

This topic describes the administrative tasks associated with managing Snowpipe.

In this Topic:

Loading Historic Data

Note

The information in this section pertains to automated data loads using event notifications. Calls to the Snowpipe REST API can load historic data without the need for additional steps.

An ALTER PIPE … REFRESH statement copies a set of data files staged within the previous 7 days to the Snowpipe ingest queue for loading into the target table. If you want to load data from files staged earlier, we recommend the following steps:

  1. Load the historic data into the target table by executing a COPY INTO <table> statement.

  2. Configure automatic data loads using Snowpipe with event notifications. Files that are newly staged will trigger event notifications for ingestion into the target table. Because the historic data files do not trigger event notifications, they are not loaded twice.

    For instructions, see:

    Amazon S3

    Automating Snowpipe for Amazon S3

    Google Cloud Storage

    Automating Snowpipe for Google Cloud Storage

    Microsoft Azure

    Automating Snowpipe for Microsoft Azure Blob Storage

  3. Execute an ALTER PIPE … REFRESH statement to queue any files staged in-between Steps 1 and 2. The statement checks the load history for both the target table and the pipe to ensure the same files are not loaded twice.

Changing the Cloud Parameters of the Referenced Stage

The cloud parameters of an external stage include the following:

  • URL

  • STORAGE_INTEGRATION

  • ENCRYPTION

After Snowpipe has been configured succesfully, if you need to modify any of the cloud parameters of the referenced stage, we recommend completing the following steps:

  1. Pause the pipe (using ALTER PIPE … SET PIPE_EXECUTION_PAUSED = true). Wait for any files currently queued to be loaded into the target table.

  2. Modify the stage parameters as required (using ALTER STAGE).

  3. Resume the pipe (using ALTER PIPE ... SET PIPE_EXECUTION_PAUSED = false).

Because pipes are not transactional, these steps ensure that Snowpipe queues files using the latest stage parameter values.

Warning

Modifying the URL parameter of a stage can cause any reliant pipes that leverage cloud messaging to trigger data loads (i.e. where AUTO_INGEST = TRUE) to stop working.

Transferring Pipe Ownership

Complete the following steps to transfer ownership of a pipe:

  1. Set the PIPE_EXECUTION_PAUSED parameter to TRUE.

    This parameter enables pausing or resuming a pipe. The parameter is supported at the following levels:

    • Account

    • Schema

    • Pipe

    At the pipe level, the object owner (or a parent role in a role hierarchy) can set the parameter to pause or resume an individual pipe.

    An account administrator (user with the ACCOUNTADMIN role) can set this parameter at the account level to pause or resume all pipes in the account. Likewise, a user with the MODIFY privilege on the schema can pause or resume pipes at the schema level. Note that this larger domain control only affects pipes for which the parameter was not already set at a lower level; e.g., by the owner at the object level.

  2. Transfer ownership of the pipe using GRANT OWNERSHIP.

  3. Force the pipe to resume (using SYSTEM$PIPE_FORCE_RESUME).

    This step allows the new owner to evaluate the pipe status and determine how many data files are waiting to be loaded using SYSTEM$PIPE_STATUS. We recommend verifying that only files approved for loading into the target table are queued.

Modifying the COPY Statement in a Pipe Definition

Complete the following steps to modify the COPY statement in a pipe definition; for example, when columns are added to the target table.

To execute the commands in this section, the current role for the user must have the OWNERSHIP privilege on the pipe.

  1. Pause the pipe (using ALTER PIPE … SET PIPE_EXECUTION_PAUSED=true).

  2. Query the SYSTEM$PIPE_STATUS function and verify that the pipe execution state is PAUSED and the pending file count is 0.

  3. Recreate the pipe to change the COPY statement in the definition. Choose either of the following options:

  4. Pause the pipe again.

  5. Review the configuration steps for your cloud messaging service to ensure the settings are still accurate:

  6. Resume the pipe (using ALTER PIPE … SET PIPE_EXECUTION_PAUSED = false).

  7. Query the SYSTEM$PIPE_STATUS function again and verify that the pipe execution state is RUNNING.

Note

The file loading metadata is associated with the pipe object rather than the table. Recreating the pipe removes the history of files loaded. Ensure that files already loaded by Snowpipe are not accidentally resubmitted to the pipe and loaded into the target table again. To view the query history for a table, query the COPY_HISTORY function.

Resuming a Stale Pipe

Note

This section only pertains to pipe objects that leverage cloud messaging to trigger data loads (i.e. where AUTO_INGEST = TRUE in the pipe definition).

When a pipe is paused, event messages received for the pipe enter a limited retention period. The period is 14 days by default. If a pipe is paused for longer than 14 days, it is considered stale.

To resume a stale pipe, a qualified role must call the SYSTEM$PIPE_FORCE_RESUME function and input the STALENESS_CHECK_OVERRIDE argument. This argument indicates an understanding that the role is resuming a stale pipe.

For example, resume the stale stalepipe1 pipe in the mydb.myschema database and schema:

SELECT SYSTEM$PIPE_FORCE_RESUME('mydb.myschema.stalepipe1’,’staleness_check_override`);

While the stale pipe was paused, if ownership of the pipe was transferred to another role, then resuming the pipe requires the additional OWNERSHIP_TRANSFER_CHECK_OVERRIDE argument. For example, resume the stale stalepipe2 pipe in the mydb.myschema database and schema, which transferred to a new role:

SELECT SYSTEM$PIPE_FORCE_RESUME('mydb.myschema.stalepipe1’,’staleness_check_override, ownership_transfer_check_override`);

As an event notification received while a pipe is paused reaches the end of the limited retention period, Snowflake schedules it to be dropped from the internal metadata. If the pipe is later resumed, Snowpipe processes these older notifications on a best effort basis. Snowflake cannot guarantee that they are processed.

For example, if a pipe is resumed 15 days after it was paused, Snowpipe generally skips any event notifications that were received on the first day the pipe was paused (i.e. that are now more than 14 days old). If the pipe is resumed 16 days after it was paused, Snowpipe generally skips any event notifications that were received on the first and second days after the pipe was paused. And so on.