Querying directory tables¶

This topic covers how to query a directory table to retrieve a list of all files on a stage with metadata, such as the Snowflake file URL, for each file.

Syntax for querying a directory table:

SELECT * FROM DIRECTORY( @<stage_name> )
Copy

Where:

stage_name

Name of a stage that has a directory table enabled.

For information about SELECT as a statement, and the other clauses within the statement, see Query Syntax in the Snowflake SQL Command Reference.

Output¶

The output from a directory table query can include the following columns:

Column

Data Type

Description

RELATIVE_PATH

TEXT

Path to the files to access using the file URL.

SIZE

NUMBER

Size of the file (in bytes).

LAST_MODIFIED

TIMESTAMP_LTZ

Timestamp when the file was last updated in the stage.

MD5

HEX

MD5 checksum for the file.

ETAG

HEX

ETag header for the file.

FILE_URL

TEXT

Snowflake file URL to the file.

The file URL has the following format:

https://<account_identifier>/api/files/<db_name>.<schema_name>.<stage_name>/<relative_path>
Copy

Where:

account_identifier

Hostname of the Snowflake account for your stage. The hostname starts with an account locator (provided by Snowflake) and ends with the Snowflake domain (snowflakecomputing.com):

account_locator.snowflakecomputing.com

For more details, see Account identifiers.

Note

For Business Critical accounts, a privatelink segment is prepended to the URL just before snowflakecomputing.com (privatelink.snowflakecomputing.com), even if private connectivity to the Snowflake service is not enabled for your account.

db_name

Name of the database that contains the stage where your files are located.

schema_name

Name of the schema that contains the stage where your files are located.

stage_name

Name of the stage where your files are located.

relative_path

Path to the files to access using the file URL.

Usage notes¶

  • If files downloaded from an internal stage are corrupted, verify with the stage creator that ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE') is set for the stage.

Examples¶

This example retrieves all metadata columns in a directory table for a stage named mystage:

SELECT * FROM DIRECTORY(@mystage);
Copy

This example retrieves the FILE_URL column values from a directory table for files greater than 100 K bytes in size:

SELECT FILE_URL FROM DIRECTORY(@mystage) WHERE SIZE > 100000;
Copy

This example retrieves the FILE_URL column values from a directory table for comma-separated value files:

SELECT FILE_URL FROM DIRECTORY(@mystage) WHERE RELATIVE_PATH LIKE '%.csv';
Copy

Creating a view for unstructured data using a directory table¶

You can join a directory table with other Snowflake tables to produce a view of unstructured data that combines the file URLs with metadata about the files.

The following diagram illustrates how you can use a stage with a directory table enabled along with a separate data table to create a comprehensive view for unstructured files on a stage.

Create a comprehensive view of unstructured data by joining a directory table with another Snowflake table

Example: Creating a view of PDF files and their data

The following example creates a view called reports_information by joining a directory table on a stage named my_pdf_stage with a table named report_metadata using the file_url key. The stage contains PDF reports, while the report_metadata table contains structured information about each PDF report such as the author and publish_date. The resulting view provides a way to get information about the unstructured PDFs and their related, structured metadata.

CREATE VIEW reports_information AS
  SELECT
    file_url as report_link,
    author,
    publish_date,
    approved_date,
    geography,
    num_of_pages
  FROM directory(@my_pdf_stage) s
  JOIN report_metadata m
  ON s.file_url = m.file_url
Copy