Introduction to Unstructured Data Support¶
This topic introduces key concepts and provides instructions for accessing and sharing unstructured data files.
In this Topic:
Cloud Storage Service Support¶
Both external (external cloud storage) and internal (i.e. Snowflake) stages support unstructured data.
- External stages
Store files in external cloud storage: Amazon S3, Google Cloud Storage, or one of the supported Microsoft Azure cloud storage services:
Blob storage
Data Lake Storage Gen2
General-purpose v1
General-purpose v2
Types of URLs Available to Access Files¶
The following types of URLs are available to access files in cloud storage:
- Scoped URL
Encoded URL that permits temporary access to a staged file without granting privileges to the stage.
The URL expires when the persisted query result period ends (i.e. the results cache expires), which is currently 24 hours.
- File URL
URL that identifies the database, schema, stage, and file path to a set of files. A role that has sufficient privileges on the stage can access the files.
- Pre-signed URL
Simple HTTPS URL used to access a file via a web browser. A file is temporarily accessible to users via this URL using a pre-signed access token. The expiration time for the access token is configurable.
The following table describes key characteristics of these URL types:
Scoped URL |
File URL |
Pre-signed URL |
|
---|---|---|---|
Use cases |
Recommended for file administrators to give scoped access to data files to specific roles in the same account. Provide access to the files via a view that retrieves scoped URLs. Only roles that have privileges on the view can access the files. Ideal for use in custom applications, providing unstructured data to other accounts via a share, or for downloading and ad hoc analysis of unstructured data via Snowsight. |
Permanent URL to a file on a stage. To download or access a file, users send the file URL in a GET request to the REST API endpoint along with the authorization token. Ideal for custom applications that require access to unstructured data files. |
Used to download or access files without authenticating into Snowflake or passing an authorization token. Pre-signed URLs are open; any user or application can directly access or download the files. Ideal for business intelligence applications or reporting tools that need to display the unstructured file contents. |
How to generate |
Query the BUILD_SCOPED_FILE_URL function. |
Either Query the directory table for the stage that references the staged files or call the BUILD_STAGE_FILE_URL function. |
Query the GET_PRESIGNED_URL function. |
Usage |
The following options are available:
|
The following options are available:
|
The following options are available:
|
Unstructured data files can be accessed by data consumers via column values of this type in secure views shared by data providers. |
Unstructured data files cannot be accessed by data consumers via column values of this type in secure views shared by data providers. |
Unstructured data files can be accessed by data consumers via column values of this type in secure views shared by data providers. |
|
Authorization |
Only the user who generates a scoped URL can use the URL to access the referenced file. |
Role specified in the GET REST API call must have sufficient privileges on the stage: USAGE (external stage) or READ (internal stage). |
Any person who has the pre-signed URL can access the referenced file for the life of the token. |
Expiration |
Expiration period for the query results cache (currently 24 hours). |
Permanent. |
Length of time specified in the |
Server-side Encryption Only for Stages¶
To support file access using pre-signed URLs, create named internal or external stages with server-side encryption rather than client-side encryption. When files in a stage are client-side encrypted, users cannot read the staged files without having access to the encryption key.
Note
Currently, creating internal stages with server-side encryption is limited to the following Snowflake client versions:
JDBC Driver v3.12.11 (or higher)
Directory Tables¶
Directory tables store a catalog of staged files in cloud storage. Roles with sufficient privileges can query a directory table to retrieve file URLs to access the staged files.
For details, see Directory Tables.
SQL Functions¶
The following File Functions are provided to access data files:
SQL Function |
Description |
---|---|
Returns the URL for an external or internal named stage using the stage name as the input. |
|
Extracts the path of a staged file relative to its location in the stage using the stage name and absolute file path in cloud storage as inputs. |
|
Returns the absolute path of a staged file using the stage name and path of the file relative to its location in the stage as inputs. |
|
Generates the pre-signed URL to a staged file using the stage name and relative file path as inputs. Access files in an external stage using the function. |
|
Generates a scoped Snowflake-hosted URL to a staged file using the stage name and relative file path as inputs. |
|
Generates a Snowflake-hosted file URL to a staged file using the stage name and relative file path as inputs. |
Downloading Staged Files in Snowsight¶
Users can click on a generated scoped, pre-signed, or file URL in the results table of a Snowsight worksheet and download the referenced file.
Log into Snowsight, the Snowflake web interface.
Click Worksheets » My Worksheets (or open a local worksheet under Recent or in Folders) » <worksheet_name>.
Return a scoped, pre-signed, or file URL in a query using any one of the supported methods.
Click on the URL in the results table. Snowsight downloads the file referenced by the URL.