Query metadata for staged files¶
Snowflake automatically generates metadata for files in internal (i.e. Snowflake) stages or external (Amazon S3, Google Cloud Storage, or Microsoft Azure) stages. This metadata is “stored” in virtual columns that can be:
Queried using a standard SELECT statement.
Loaded into a table, along with the regular data columns, using COPY INTO <table>. For general information about querying staged data files, see Query data in staged files.
Metadata columns¶
Currently, the following metadata columns can be queried or copied into tables:
- METADATA$FILENAME
Name of the staged data file the current row belongs to. Includes the full path to the data file.
- METADATA$FILE_ROW_NUMBER
Row number for each record in the staged data file.
- METADATA$FILE_CONTENT_KEY
Checksum of the staged data file the current row belongs to.
- METADATA$FILE_LAST_MODIFIED
Last modified timestamp of the staged data file the current row belongs to. Returned as TIMESTAMP_NTZ.
- METADATA$START_SCAN_TIME
Start timestamp of operation for each record in the staged data file. Returned as TIMESTAMP_LTZ.
Query limitations¶
Metadata cannot be inserted into existing table rows.
Metadata columns can only be queried by name; as such, they are not included in the output of any of the following statements:
Query examples¶
Example 1: Query the metadata columns for a CSV file¶
The following example illustrates staging multiple CSV data files (with the same file format) and then querying the metadata columns, as well as the regular data columns, in the files.
This example assumes the files have the following names and are located in the root directory in a macOS or Linux environment:
/tmp/data1.csvcontains two records:/tmp/data2.csvcontains two records:
To stage and query the files:
Note
The file format is required in this example to correctly parse the fields in the staged files. In the second query, the file format is omitted, causing the | field delimiter to
be ignored and resulting in the values returned for $1 and $2.
However, if the file format is included in the stage definition, you can omit it from the SELECT statement. See the next example for details.
Example 2: Query the metadata columns for a JSON file¶
This example illustrates staging a JSON data file containing the following objects and then querying the metadata columns, as well as the objects, in the file:
This example assumes the file is named /tmp/data1.json and is located in the root directory in a macOS or Linux environment.
To stage and query the file:
Example 3: Load metadata columns into a table¶
The COPY INTO <table> command supports copying metadata from staged data files into a target table. Use the data transformation syntax (i.e. a SELECT list) in your COPY statement. For more information about transforming data using a COPY statement, see Transform data during a load.
The following example loads the metadata columns and regular data columns from Example 1: Query the metadata columns for a CSV file into a table: