Managing Directory Tables¶
This topic provides instructions for creating and managing external or internal stages with directory tables.
Automatically Refreshing Directory Table Metadata¶
The metadata for a directory table can be refreshed automatically using the event messaging service for your cloud storage service.
The refresh operation synchronizes the metadata with the latest set of associated files in the external stage and path, i.e.:
New files in the path are added to the table metadata.
Changes to files in the path are updated in the table metadata.
Files no longer in the path are removed from the table metadata.
For instructions to create stages with automatically refreshed directory tables, see Automated Directory Table Metadata Refreshes.
Automatically refreshing the metadata is not available for directory tables on internal stages that reference external cloud storage. You must manually refresh the directory table metadata for these types of stages. For instructions, see Manually Refreshing a Directory Table (in this topic).
To register any missed files, we suggest that you follow our best practices for staging your data files
and periodically execute an ALTER STAGE … REFRESH statement. For satisfactory
performance, we also recommend that you use a selective path prefix with ALTER STAGE. Doing so reduces the number of files that need to be listed and
checked if they have been registered already (e.g.
bucket_name/YYYY/MM/DD/ or even
depending on your volume).
Creating a Stage with a Directory Table¶
This section provides instructions for creating stages (using CREATE STAGE) that layer a directory table to store metadata about the staged files.
Directory tables on internal stages require manual metadata refreshes. You could also choose to include a directory table on external stages and refresh the metadata manually. For information about automated metadata refreshes, see Automatically Refreshing Directory Table Metadata (in this topic).
The syntax for creating a stage with a directory table is nearly identical to creating a standard external or internal stage. Set the optional DIRECTORY parameter to TRUE.
After you create a stage with a directory table, you must execute ALTER STAGE … REFRESH to manually refresh the directory table metadata.
Create an internal stage named
mystage that includes a directory table. The stage references a file format named
CREATE STAGE mystage DIRECTORY = (ENABLE = TRUE) FILE_FORMAT = myformat;
Create an external stage named
mystage that includes a directory table. The stage references a bucket or container named
with a path of
files. Secure access to the cloud storage location is provided via the
my_storage_int storage integration:
The storage location in the URL value must end in a forward slash (
CREATE STAGE mystage URL='s3://load/files/' STORAGE_INTEGRATION = my_storage_int DIRECTORY = (ENABLE = TRUE);
Google Cloud Storage
CREATE STAGE mystage URL='gcs://load/files/' STORAGE_INTEGRATION = my_storage_int DIRECTORY = (ENABLE = TRUE);
CREATE STAGE mystage URL='azure://myaccount.blob.core.windows.net/load/files/' STORAGE_INTEGRATION = my_storage_int DIRECTORY = (ENABLE = TRUE);
Manually Refreshing a Directory Table¶
Refresh the metadata in a directory table manually using the ALTER STAGE command.
Manually refreshing metadata on an external stage blocks any automated refresh operations from occurring at the same time. Automated refreshes will resume after the manual refresh completes.
ALTER STAGE mystage REFRESH;