Automated directory table metadata refreshes¶
You can automatically refresh the metadata for a directory table on an internal or external stage.
The refresh operation synchronizes the metadata with the latest set of associated files in storage, and occurs in response to the following types of changes:
New files in the path are added to the table metadata.
Files in the path are updated in the table metadata.
Files no longer in the path are removed from the table metadata.
Internal stages¶
Automatically refreshing the directory table on an internal stage synchronizes the metadata with the latest set of associated files in the internal named stage and path when the following occur:
New files in the path are added to the table metadata.
Changes to files in the path are updated in the table metadata.
Files no longer in the path are removed from the table metadata.
Create an internal named stage with a directory table enabled¶
Create an internal named stage with a directory table enabled by using the CREATE STAGE command. Snowflake reads your staged data files into the directory table metadata.
CREATE STAGE my_int_stage
DIRECTORY = (
ENABLE = TRUE
AUTO_REFRESH = TRUE
);
External stages¶
You can automatically refresh the metadata for a directory table by using the following event notification services:
Amazon S3: Amazon SQS (Simple Queue Service)
Google Cloud Storage: Google Cloud Pub/Sub
Microsoft Azure: Microsoft Azure Event Grid
To set up automated refreshes, see the topic for the cloud storage service where your files are located:
Refreshing directory tables automatically for Google Cloud Storage
Refreshing directory tables automatically for Azure Blob Storage
Cross-cloud support¶
Snowflake supports cross-cloud, cross-region automated directory table refreshes for external stages.
The following table shows the cross-cloud options that Snowflake supports for automated directory table refreshes, based on the cloud platform that hosts your Snowflake account.
Amazon S3 |
Google Cloud Storage |
Microsoft Azure Blob storage |
Microsoft Data Lake Storage Gen2 |
Microsoft Azure General-purpose v2 |
|
---|---|---|---|---|---|
Accounts hosted on AWS |
✔ |
✔ |
✔ |
✔ |
✔ |
Accounts hosted on GCP |
✔ |
✔ |
✔ |
✔ |
✔ |
Accounts hosted on Azure |
✔ |
✔ |
✔ |
✔ |
✔ |
Considerations¶
Automated refreshes are event-based and provide better performance that manual refreshes for large or fast-growing stages.
Automated refreshes for internal stages is currently available for accounts hosted on AWS. Snowflake doesn’t support refreshing the directory table metadata on an internal stage when your account is hosted on Google Cloud or Azure.
Next Topics: