CREATE STAGE¶
Creates a new named internal or external stage to use for loading data from files into Snowflake tables and unloading data from tables into files:
- Internal stage
Stores data files internally within Snowflake. Internal stages can be either permanent or temporary. For more details, see Choosing a Stage for Local Files.
- External stage
References data files stored in a location outside of Snowflake. Currently, the following cloud storage services are supported:
Amazon S3 buckets
Google Cloud Storage buckets
Microsoft Azure containers
The storage location can be either private/protected or public.
Syntax¶
-- Internal stage
CREATE [ OR REPLACE ] [ TEMPORARY ] STAGE [ IF NOT EXISTS ] <internal_stage_name>
[ FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]
[ COPY_OPTIONS = ( copyOptions ) ]
[ COMMENT = '<string_literal>' ]
-- External stage
CREATE [ OR REPLACE ] [ TEMPORARY ] STAGE [ IF NOT EXISTS ] <external_stage_name>
externalStageParams
[ FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]
[ COPY_OPTIONS = ( copyOptions ) ]
[ COMMENT = '<string_literal>' ]
Where:
externalStageParams (for Amazon S3) ::=
  URL = 's3://<bucket>[/<path>/]'
  [ { STORAGE_INTEGRATION = <integration_name> } |
    { CREDENTIALS = ( { { AWS_KEY_ID = '<string>' AWS_SECRET_KEY = '<string>' [ AWS_TOKEN = '<string>' ] } | AWS_ROLE = '<string>' } ) } ]
  [ ENCRYPTION = ( [ TYPE = 'AWS_CSE' ] [ MASTER_KEY = '<string>' ] |
                   [ TYPE = 'AWS_SSE_S3' ] |
                   [ TYPE = 'AWS_SSE_KMS' [ KMS_KEY_ID = '<string>' ] ] |
                   [ TYPE = NONE ] ) ]

externalStageParams (for Google Cloud Storage) ::=
  URL = 'gcs://<bucket>[/<path>/]'
  [ STORAGE_INTEGRATION = <integration_name> ]
  [ ENCRYPTION = ( [ TYPE = 'GCS_SSE_KMS' ] [ KMS_KEY_ID = '<string>' ] | [ TYPE = NONE ] ) ]

externalStageParams (for Microsoft Azure) ::=
  URL = 'azure://<account>.blob.core.windows.net/<container>[/<path>/]'
  [ { STORAGE_INTEGRATION = <integration_name> } | { CREDENTIALS = ( [ AZURE_SAS_TOKEN = '<string>' ] ) } ]
  [ ENCRYPTION = ( [ TYPE = { 'AZURE_CSE' | NONE } ] [ MASTER_KEY = '<string>' ] ) ]

formatTypeOptions ::=
-- If TYPE = CSV
  COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
  RECORD_DELIMITER = '<character>' | NONE
  FIELD_DELIMITER = '<character>' | NONE
  FILE_EXTENSION = '<string>'
  SKIP_HEADER = <integer>
  SKIP_BLANK_LINES = TRUE | FALSE
  DATE_FORMAT = '<string>' | AUTO
  TIME_FORMAT = '<string>' | AUTO
  TIMESTAMP_FORMAT = '<string>' | AUTO
  BINARY_FORMAT = HEX | BASE64 | UTF8
  ESCAPE = '<character>' | NONE
  ESCAPE_UNENCLOSED_FIELD = '<character>' | NONE
  TRIM_SPACE = TRUE | FALSE
  FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | NONE
  NULL_IF = ( '<string>' [ , '<string>' ... ] )
  ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE
  REPLACE_INVALID_CHARACTERS = TRUE | FALSE
  VALIDATE_UTF8 = TRUE | FALSE
  EMPTY_FIELD_AS_NULL = TRUE | FALSE
  SKIP_BYTE_ORDER_MARK = TRUE | FALSE
  ENCODING = '<string>' | UTF8
-- If TYPE = JSON
  COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
  DATE_FORMAT = '<string>' | AUTO
  TIME_FORMAT = '<string>' | AUTO
  TIMESTAMP_FORMAT = '<string>' | AUTO
  BINARY_FORMAT = HEX | BASE64 | UTF8
  TRIM_SPACE = TRUE | FALSE
  NULL_IF = ( '<string>' [ , '<string>' ... ] )
  FILE_EXTENSION = '<string>'
  ENABLE_OCTAL = TRUE | FALSE
  ALLOW_DUPLICATE = TRUE | FALSE
  STRIP_OUTER_ARRAY = TRUE | FALSE
  STRIP_NULL_VALUES = TRUE | FALSE
  REPLACE_INVALID_CHARACTERS = TRUE | FALSE
  IGNORE_UTF8_ERRORS = TRUE | FALSE
  SKIP_BYTE_ORDER_MARK = TRUE | FALSE
-- If TYPE = AVRO
  COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
  TRIM_SPACE = TRUE | FALSE
  NULL_IF = ( '<string>' [ , '<string>' ... ] )
-- If TYPE = ORC
  TRIM_SPACE = TRUE | FALSE
  NULL_IF = ( '<string>' [ , '<string>' ... ] )
-- If TYPE = PARQUET
  COMPRESSION = AUTO | LZO | SNAPPY | NONE
  SNAPPY_COMPRESSION = TRUE | FALSE
  BINARY_AS_TEXT = TRUE | FALSE
  TRIM_SPACE = TRUE | FALSE
  NULL_IF = ( '<string>' [ , '<string>' ... ] )
-- If TYPE = XML
  COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
  IGNORE_UTF8_ERRORS = TRUE | FALSE
  PRESERVE_SPACE = TRUE | FALSE
  STRIP_OUTER_ELEMENT = TRUE | FALSE
  DISABLE_SNOWFLAKE_DATA = TRUE | FALSE
  DISABLE_AUTO_CONVERT = TRUE | FALSE
  SKIP_BYTE_ORDER_MARK = TRUE | FALSE
  TRIM_SPACE = TRUE | FALSE
  NULL_IF = ( '<string>' [ , '<string>' ... ] )

copyOptions ::=
  ON_ERROR = { CONTINUE | SKIP_FILE | SKIP_FILE_<num> | SKIP_FILE_<num>% | ABORT_STATEMENT }
  SIZE_LIMIT = <num>
  PURGE = TRUE | FALSE
  RETURN_FAILED_ONLY = TRUE | FALSE
  MATCH_BY_COLUMN_NAME = CASE_SENSITIVE | CASE_INSENSITIVE | NONE
  ENFORCE_LENGTH = TRUE | FALSE
  TRUNCATECOLUMNS = TRUE | FALSE
  FORCE = TRUE | FALSE
Required Parameters¶
internal_stage_name or external_stage_name

Specifies the identifier for the stage; must be unique for the schema in which the stage is created.

In addition, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (e.g. "My object"). Identifiers enclosed in double quotes are also case-sensitive. For more details, see Identifier Requirements.
Note
When creating an external stage, a URL is also required. For more details, see External Stage Parameters (in this topic).
If a URL is not specified, Snowflake creates an internal stage by default.
Optional Parameters¶
TEMPORARY
Specifies that the stage created is temporary and will be dropped at the end of the session in which it was created. Note:
When a temporary external stage is dropped, only the stage itself is dropped; the data files are not removed.
When a temporary internal stage is dropped, all of the files in the stage are purged from Snowflake, regardless of their load status. This prevents files in temporary internal stages from using data storage and, consequently, accruing storage charges. However, this also means that the staged files cannot be recovered through Snowflake once the stage is dropped.
Tip
If you plan to create and use temporary internal stages, you should maintain copies of your data files outside of Snowflake.
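For example, a minimal sketch (the stage name is hypothetical) of a temporary internal stage that is dropped, along with its staged files, at the end of the session:

CREATE TEMPORARY STAGE my_temp_stage
  COMMENT = 'Session-scoped; staged files are purged when the session ends';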
FILE_FORMAT = ( FORMAT_NAME = 'file_format_name' )
or FILE_FORMAT = ( TYPE = CSV | JSON | AVRO | ORC | PARQUET | XML [ ... ] )
Specifies the file format for the stage, which can be either:
FORMAT_NAME = 'file_format_name'
Specifies an existing named file format to use for the stage. The named file format determines the format type (CSV, JSON, etc.), as well as any other format options, for the data files loaded using this stage. For more details, see CREATE FILE FORMAT.
TYPE = CSV | JSON | AVRO | ORC | PARQUET | XML [ ... ]
Specifies the type of files for the stage:
Loading data from a stage (using COPY INTO <table>) accommodates all of the supported format types.
Unloading data into a stage (using COPY INTO <location>) accommodates CSV, JSON, or PARQUET.
If a file format type is specified, additional format-specific options can be specified. For more details, see Format Type Options (formatTypeOptions) (in this topic).
Default:
TYPE = CSV
Note
FORMAT_NAME and TYPE are mutually exclusive; you can only specify one or the other for a stage.

COPY_OPTIONS = ( ... )
Specifies one (or more) copy options for the stage. For more details, see Copy Options (copyOptions) (in this topic).
COMMENT = 'string_literal'
Specifies a comment for the stage.
Default: No value
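As an illustrative sketch combining these optional parameters (the stage name is hypothetical, and my_csv_format is assumed to already exist):

CREATE STAGE my_csv_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  COPY_OPTIONS = (ON_ERROR = 'SKIP_FILE')
  COMMENT = 'Stage for daily CSV loads';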
External Stage Parameters (externalStageParams)¶
URL = 'cloud_specific_url'
If this parameter is omitted, Snowflake creates an internal stage.
Important
Enclose the URL in single quotes ('') in order for Snowflake to identify the string. If the quotes are omitted, any credentials you supply may be displayed in plain text in the history. We strongly recommend verifying the syntax of the CREATE STAGE statement before you execute it.

The Create Stage wizard in the Snowflake web interface automatically encloses field values in quotation characters, as needed. Using the web interface to create stages is a good option for less-experienced users.
Amazon S3
URL = 's3://bucket[/path/]'
Specifies the URL for the external location (existing S3 bucket) used to store data files for loading/unloading, where:
bucket is the name of the S3 bucket. Note that S3 buckets in non-public AWS regions, such as government regions, are not supported.

path is an optional case-sensitive path for files in the cloud storage location (i.e. files have names that begin with a common string) that limits the set of files. Paths are alternatively called prefixes or folders by different cloud storage services.
Google Cloud Storage
URL = 'gcs://bucket[/path/]'
Specifies the URL for the external location (existing GCS bucket) used to store data files for loading/unloading, where:
bucket is the name of the GCS bucket.

path is an optional case-sensitive path for files in the cloud storage location (i.e. files have names that begin with a common string) that limits the set of files. Paths are alternatively called prefixes or folders by different cloud storage services.
Microsoft Azure
URL = 'azure://account.blob.core.windows.net/container[/path/]'
Specifies the URL for the external location (existing Azure container) used to store data files for loading, where:
account is the name of the Azure account (e.g. myaccount). Use the blob.core.windows.net endpoint for all supported types of Azure blob storage accounts, including Data Lake Storage Gen2.

container is the name of the Azure container (e.g. mycontainer).

path is an optional case-sensitive path for files in the cloud storage location (i.e. files have names that begin with a common string) that limits the set of files. Paths are alternatively called prefixes or folders by different cloud storage services.
Default: No value (an internal stage is created)
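For illustration, the three URL forms look as follows (the bucket, account, and container names are placeholders, not working locations):

URL = 's3://mybucket/load/files/'
URL = 'gcs://mybucket/load/files/'
URL = 'azure://myaccount.blob.core.windows.net/mycontainer/load/files/'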
STORAGE_INTEGRATION = integration_name
or CREDENTIALS = ( cloud_specific_credentials )
Required only if the storage location is private/protected; not required for public buckets/containers
Amazon S3
STORAGE_INTEGRATION = integration_name
Specifies the name of the storage integration used to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity. For more details, see CREATE STORAGE INTEGRATION.
Note
We highly recommend the use of storage integrations. This option avoids the need to supply cloud storage credentials using the CREDENTIALS parameter when creating stages or loading data.
CREDENTIALS = ( AWS_KEY_ID = 'string' AWS_SECRET_KEY = 'string' [ AWS_TOKEN = 'string' ] )
or CREDENTIALS = ( AWS_ROLE = 'string' )
Specifies the security credentials for connecting to AWS and accessing the private/protected S3 bucket where the files to load/unload are staged. For more information, see Configuring Secure Access to Amazon S3.
The credentials you specify depend on whether you associated the Snowflake access permissions for the bucket with an AWS IAM (Identity & Access Management) user or role:
IAM user: IAM credentials are required. Temporary (aka “scoped”) credentials are generated by AWS Security Token Service (STS) and consist of three components:
AWS_KEY_ID
AWS_SECRET_KEY
AWS_TOKEN
All three are required to access a private/protected bucket. After a designated period of time, temporary credentials expire and can no longer be used. You must then generate a new set of valid temporary credentials.
Important
The COPY command also allows permanent (aka “long-term”) credentials to be used; however, for security reasons, Snowflake does not recommend using them. If you must use permanent credentials, Snowflake recommends periodically generating new permanent credentials for external stages.
IAM role: Omit the security credentials and access keys and, instead, identify the role using AWS_ROLE and specify the AWS role ARN (Amazon Resource Name).
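A minimal sketch of the two credential forms (all values are placeholders for illustration only):

-- IAM user with temporary STS credentials
CREDENTIALS = (AWS_KEY_ID = '1a2b3c' AWS_SECRET_KEY = '4x5y6z' AWS_TOKEN = '9p8q7r')

-- IAM role identified by its ARN
CREDENTIALS = (AWS_ROLE = 'arn:aws:iam::001234567890:role/mysnowflakerole')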
Google Cloud Storage
STORAGE_INTEGRATION = integration_name
Specifies the name of the storage integration used to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity. For more details, see CREATE STORAGE INTEGRATION.
Microsoft Azure
STORAGE_INTEGRATION = integration_name
Specifies the name of the storage integration used to delegate authentication responsibility for external cloud storage to a Snowflake identity and access management (IAM) entity. For more details, see CREATE STORAGE INTEGRATION.
Note
We highly recommend the use of storage integrations. This option avoids the need to supply cloud storage credentials using the CREDENTIALS parameter when creating stages or loading data.
CREDENTIALS = ( AZURE_SAS_TOKEN = 'string' )
Specifies the SAS (shared access signature) token for connecting to Azure and accessing the private/protected container where the files containing loaded data are staged. Credentials are generated by Azure.
Default: No value (no credentials are provided for the external stage)
ENCRYPTION = ( cloud_specific_encryption )
Required only for loading from/unloading into encrypted files; not required if storage location and files are unencrypted
- Data loading
Modifies the encryption settings used to decrypt encrypted files in the storage location and extract data.
- Data unloading
Modifies the encryption settings used to encrypt files unloaded to the storage location.
Amazon S3
ENCRYPTION = ( [ TYPE = 'AWS_CSE' ] [ MASTER_KEY = '<string>' ] | [ TYPE = 'AWS_SSE_S3' ] | [ TYPE = 'AWS_SSE_KMS' [ KMS_KEY_ID = '<string>' ] ] | [ TYPE = NONE ] )
TYPE = ...
Specifies the encryption type used. Possible values are:
AWS_CSE: Client-side encryption (requires a MASTER_KEY value). Currently, the client-side master key you provide can only be a symmetric key. Note that, when a MASTER_KEY value is provided, Snowflake assumes TYPE = AWS_CSE (i.e. when a MASTER_KEY value is provided, TYPE is not required).

AWS_SSE_S3: Server-side encryption that requires no additional encryption settings.

AWS_SSE_KMS: Server-side encryption that accepts an optional KMS_KEY_ID value.
For more information about the encryption types, see the AWS documentation for client-side encryption or server-side encryption.
MASTER_KEY = 'string' (applies to AWS_CSE encryption only)

Specifies the client-side master key used to encrypt the files in the bucket. The master key must be a 128-bit or 256-bit key in Base64-encoded form.

KMS_KEY_ID = 'string' (applies to AWS_SSE_KMS encryption only)

Optionally specifies the ID for the AWS KMS-managed key used to encrypt files unloaded into the bucket. If no value is provided, your default KMS key ID is used to encrypt files on unload.
Note that this value is ignored for data loading.
Google Cloud Storage
ENCRYPTION = ( [ TYPE = 'GCS_SSE_KMS' ] [ KMS_KEY_ID = '<string>' ] | [ TYPE = NONE ] )
TYPE = ...
Specifies the encryption type used. Possible values are:
GCS_SSE_KMS: Server-side encryption that accepts an optional KMS_KEY_ID value.
For more information, see the Google Cloud Platform documentation.
KMS_KEY_ID = 'string' (applies to GCS_SSE_KMS encryption only)

Optionally specifies the ID for the Cloud KMS-managed key that is used to encrypt files unloaded into the bucket. If no value is provided, your default KMS key ID set on the bucket is used to encrypt files on unload.
Note that this value is ignored for data loading. The load operation should succeed if the service account has sufficient permissions to decrypt data in the bucket.
Microsoft Azure
ENCRYPTION = ( [ TYPE = 'AZURE_CSE' | NONE ] [ MASTER_KEY = 'string' ] )
TYPE = ...
Specifies the encryption type used. Possible values are:
AZURE_CSE: Client-side encryption (requires a MASTER_KEY value). For information, see the Client-side encryption information in the Microsoft Azure documentation.

NONE: No encryption.

MASTER_KEY = 'string' (applies to AZURE_CSE encryption only)

Specifies the client-side master key used to encrypt or decrypt files. The master key must be a 128-bit or 256-bit key in Base64-encoded form.
Format Type Options (formatTypeOptions)¶

Depending on the file format type specified (FILE_FORMAT = ( TYPE = ... )), you can include one or more of the following format-specific options (separated by blank spaces, commas, or new lines):
TYPE = CSV¶
COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
- Use
Data loading and unloading
- Definition
When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.
When unloading data, compresses the data file using the specified compression algorithm.
- Values
| Supported Values | Notes |
| --- | --- |
| AUTO | When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip. |
| GZIP | |
| BZ2 | |
| BROTLI | Must be specified when loading/unloading Brotli-compressed files. |
| ZSTD | Zstandard v0.8 (and higher) is supported. |
| DEFLATE | Deflate-compressed files (with zlib header, RFC1950). |
| RAW_DEFLATE | Raw Deflate-compressed files (without header, RFC1951). |
| NONE | When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed. |
- Default
AUTO
RECORD_DELIMITER = 'character' | NONE
- Use
Data loading and unloading
- Definition
One or more singlebyte or multibyte characters that separate records in an input file (data loading) or unloaded file (data unloading).
Accepts common escape sequences, octal values (prefixed by \\), or hex values (prefixed by 0x). For example, for records delimited by the thorn (Þ) character, specify the octal (\\336) or hex (0xDE) value. Also accepts a value of NONE.

The specified delimiter must be a valid UTF-8 character and not a random sequence of bytes.

Multiple-character delimiters are also supported; however, the delimiter for RECORD_DELIMITER or FIELD_DELIMITER cannot be a substring of the delimiter for the other file format option (e.g. FIELD_DELIMITER = 'aa' RECORD_DELIMITER = 'aabb'). The delimiter is limited to a maximum of 20 characters. Do not specify characters used for other file format options such as ESCAPE or ESCAPE_UNENCLOSED_FIELD.

- Default
- Data loading
New line character. Note that “new line” is logical such that \r\n will be understood as a new line for files on a Windows platform.
- Data unloading
New line character (\n).
FIELD_DELIMITER = 'character' | NONE
- Use
Data loading and unloading
- Definition
One or more singlebyte or multibyte characters that separate fields in an input file (data loading) or unloaded file (data unloading).
Accepts common escape sequences, octal values (prefixed by \\), or hex values (prefixed by 0x). For example, for fields delimited by the thorn (Þ) character, specify the octal (\\336) or hex (0xDE) value. Also accepts a value of NONE.

The specified delimiter must be a valid UTF-8 character and not a random sequence of bytes.

Multiple-character delimiters are also supported; however, the delimiter for RECORD_DELIMITER or FIELD_DELIMITER cannot be a substring of the delimiter for the other file format option (e.g. FIELD_DELIMITER = 'aa' RECORD_DELIMITER = 'aabb'). The delimiter is limited to a maximum of 20 characters. Do not specify characters used for other file format options such as ESCAPE or ESCAPE_UNENCLOSED_FIELD.

- Default
comma (,)
FILE_EXTENSION = 'string' | NONE
- Use
Data unloading only
- Definition
Specifies the extension for files unloaded to a stage. Accepts any extension. The user is responsible for specifying a file extension that can be read by any desired software or services.
- Default
null, meaning the file extension is determined by the format type: .csv[compression], where compression is the extension added by the compression method, if COMPRESSION is set.

Note

If the SINGLE copy option is TRUE, then the COPY command unloads a file without a file extension by default. To specify a file extension, provide a file name and extension in the internal_location or external_location path (e.g. copy into @stage/data.csv).

SKIP_HEADER = integer
- Use
Data loading only
- Definition
Number of lines at the start of the file to skip.
Note that SKIP_HEADER does not use the RECORD_DELIMITER or FIELD_DELIMITER values to determine what a header line is; rather, it simply skips the specified number of CRLF (Carriage Return, Line Feed)-delimited lines in the file. RECORD_DELIMITER and FIELD_DELIMITER are then used to determine the rows of data to load.
- Default
0
SKIP_BLANK_LINES = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies to skip any blank lines encountered in the data files; otherwise, blank lines produce an end-of-record error (default behavior).
- Default
FALSE
DATE_FORMAT = 'string' | AUTO
- Use
Data loading and unloading
- Definition
Defines the format of date values in the data files (data loading) or table (data unloading). If a value is not specified or is AUTO, the value for the DATE_INPUT_FORMAT (data loading) or DATE_OUTPUT_FORMAT (data unloading) parameter is used.

- Default
AUTO
TIME_FORMAT = 'string' | AUTO
- Use
Data loading and unloading
- Definition
Defines the format of time values in the data files (data loading) or table (data unloading). If a value is not specified or is AUTO, the value for the TIME_INPUT_FORMAT (data loading) or TIME_OUTPUT_FORMAT (data unloading) parameter is used.

- Default
AUTO
TIMESTAMP_FORMAT = 'string' | AUTO
- Use
Data loading and unloading
- Definition
Defines the format of timestamp values in the data files (data loading) or table (data unloading). If a value is not specified or is AUTO, the value for the TIMESTAMP_INPUT_FORMAT (data loading) or TIMESTAMP_OUTPUT_FORMAT (data unloading) parameter is used.

- Default
AUTO
BINARY_FORMAT = HEX | BASE64 | UTF8
- Use
Data loading and unloading
- Definition
Defines the encoding format for binary input or output. The option can be used when loading data into or unloading data from binary columns in a table.
- Default
HEX
ESCAPE = 'character' | NONE
- Use
Data loading and unloading
- Definition
Single character string used as the escape character for field values. An escape character invokes an alternative interpretation on subsequent characters in a character sequence. You can use the ESCAPE character to interpret instances of the FIELD_DELIMITER, RECORD_DELIMITER, or FIELD_OPTIONALLY_ENCLOSED_BY characters in the data as literals. The escape character can also be used to escape instances of itself in the data.
Accepts common escape sequences, octal values, or hex values.
When loading data, specifies the escape character for enclosed fields. Specify the character used to enclose fields by setting FIELD_OPTIONALLY_ENCLOSED_BY.

When unloading data, if this option is set, it overrides the escape character set for ESCAPE_UNENCLOSED_FIELD.
- Default
NONE
ESCAPE_UNENCLOSED_FIELD = 'character' | NONE
- Use
Data loading and unloading
- Definition
Single character string used as the escape character for unenclosed field values only. An escape character invokes an alternative interpretation on subsequent characters in a character sequence. You can use the ESCAPE character to interpret instances of the FIELD_DELIMITER or RECORD_DELIMITER characters in the data as literals. The escape character can also be used to escape instances of itself in the data.
Accepts common escape sequences, octal values, or hex values.
Note that when unloading data, if ESCAPE is set, the escape character set for that file format option overrides this option.

- Default
backslash (\\)
TRIM_SPACE = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to remove white space from fields.
For example, if your external database software encloses fields in quotes, but inserts a leading space, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field (i.e. the quotation marks are interpreted as part of the string of field data). Set this option to TRUE to remove undesirable spaces during the data load.

As another example, if leading or trailing spaces surround quotes that enclose strings, you can remove the surrounding spaces using this option and the quote character using the FIELD_OPTIONALLY_ENCLOSED_BY option. Note that any spaces within the quotes are preserved. For example, assuming FIELD_DELIMITER = '|' and FIELD_OPTIONALLY_ENCLOSED_BY = '"':

|"Hello world"|    /* loads as */  >Hello world<
|" Hello world "|  /* loads as */  > Hello world <
| "Hello world" |  /* loads as */  >Hello world<

(the brackets in this example are not loaded; they are used to demarcate the beginning and end of the loaded strings)
- Default
FALSE
FIELD_OPTIONALLY_ENCLOSED_BY = 'character' | NONE
- Use
Data loading and unloading
- Definition
Character used to enclose strings. Value can be NONE, single quote character ('), or double quote character ("). To use the single quote character, use the octal or hex representation (0x27) or the double single-quoted escape ('').

When a field contains this character, escape it using the same character. For example, if the value is the double quote character and a field contains the string A "B" C, escape the double quotes as follows: A ""B"" C

- Default
NONE
NULL_IF = ( 'string1' [ , 'string2' , ... ] )
- Use
Data loading and unloading
- Definition
String used to convert to and from SQL NULL:
When loading data, Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.
For example:
NULL_IF = ('\\N', 'NULL', 'NUL', '')
Note that this option can include empty strings.
When unloading data, Snowflake converts SQL NULL values to the first value in the list.
- Default
\\N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \\)
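As a sketch combining several of the CSV options above (the stage name is hypothetical):

CREATE STAGE my_quoted_csv_stage
  FILE_FORMAT = (TYPE = CSV
                 SKIP_HEADER = 1
                 FIELD_OPTIONALLY_ENCLOSED_BY = '"'
                 NULL_IF = ('\\N', 'NULL', ''));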
ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to generate a parsing error if the number of delimited columns (i.e. fields) in an input file does not match the number of columns in the corresponding table.
If set to FALSE, an error is not generated and the load continues. If the file is successfully loaded:

If the input file contains records with more fields than columns in the table, the matching fields are loaded in order of occurrence in the file and the remaining fields are not loaded.

If the input file contains records with fewer fields than columns in the table, the non-matching columns in the table are loaded with NULL values.

This option assumes all the records within the input file are the same length (i.e. a file containing records of varying length returns an error regardless of the value specified for this parameter).
- Default
TRUE
Note
When transforming data during loading (i.e. using a query as the source for the COPY command), this option is ignored. There is no requirement for your data files to have the same number and ordering of columns as your target table.
REPLACE_INVALID_CHARACTERS = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to replace invalid UTF-8 characters with the Unicode replacement character (�).
If set to TRUE, Snowflake replaces invalid UTF-8 characters with the Unicode replacement character.

If set to FALSE, the load operation produces an error when invalid UTF-8 character encoding is detected.

- Default
FALSE
VALIDATE_UTF8 = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to validate UTF-8 character encoding in string column data.
If set to TRUE, Snowflake validates UTF-8 character encoding in string column data. When invalid UTF-8 character encoding is detected, the COPY command produces an error.

- Default
TRUE
Important
This option is provided only to ensure backward compatibility with earlier versions of Snowflake. You should not disable this option unless instructed by Snowflake Support.
EMPTY_FIELD_AS_NULL = TRUE | FALSE
- Use
Data loading and unloading
- Definition
When loading data, specifies whether to insert SQL NULL for empty fields in an input file, which are represented by two successive delimiters (e.g. ,,).

If set to FALSE, Snowflake attempts to cast an empty field to the corresponding column type. An empty string is inserted into columns of type STRING. For other column types, the COPY command produces an error.

When unloading data, this option is used in combination with FIELD_OPTIONALLY_ENCLOSED_BY. When FIELD_OPTIONALLY_ENCLOSED_BY = NONE, setting EMPTY_FIELD_AS_NULL = FALSE specifies to unload empty strings in tables to empty string values without quotes enclosing the field values.

If set to TRUE, FIELD_OPTIONALLY_ENCLOSED_BY must specify a character to enclose strings.
- Default
TRUE
SKIP_BYTE_ORDER_MARK = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to skip the BOM (byte order mark), if present in a data file. A BOM is a character code at the beginning of a data file that defines the byte order and encoding form.
If set to FALSE, Snowflake recognizes any BOM in data files, which could result in the BOM either causing an error or being merged into the first column in the table.

- Default
TRUE
ENCODING = 'string'
- Use
Data loading only
- Definition
String (constant) that specifies the character set of the source data when loading data into a table.
| Character Set | ENCODING Value | Supported Languages | Notes |
| --- | --- | --- | --- |
| Big5 | BIG5 | Traditional Chinese | |
| EUC-JP | EUCJP | Japanese | |
| EUC-KR | EUCKR | Korean | |
| GB18030 | GB18030 | Chinese | |
| IBM420 | IBM420 | Arabic | |
| IBM424 | IBM424 | Hebrew | |
| ISO-2022-CN | ISO2022CN | Simplified Chinese | |
| ISO-2022-JP | ISO2022JP | Japanese | |
| ISO-2022-KR | ISO2022KR | Korean | |
| ISO-8859-1 | ISO88591 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish | |
| ISO-8859-2 | ISO88592 | Czech, Hungarian, Polish, Romanian | |
| ISO-8859-5 | ISO88595 | Russian | |
| ISO-8859-6 | ISO88596 | Arabic | |
| ISO-8859-7 | ISO88597 | Greek | |
| ISO-8859-8 | ISO88598 | Hebrew | |
| ISO-8859-9 | ISO88599 | Turkish | |
| KOI8-R | KOI8R | Russian | |
| Shift_JIS | SHIFTJIS | Japanese | |
| UTF-8 | UTF8 | All languages | For loading data from delimited files (CSV, TSV, etc.), UTF-8 is the default. For loading data from all other supported file formats (JSON, Avro, etc.), as well as unloading data, UTF-8 is the only supported character set. |
| UTF-16 | UTF16 | All languages | |
| UTF-16BE | UTF16BE | All languages | |
| UTF-16LE | UTF16LE | All languages | |
| UTF-32 | UTF32 | All languages | |
| UTF-32BE | UTF32BE | All languages | |
| UTF-32LE | UTF32LE | All languages | |
| windows-1250 | WINDOWS1250 | Czech, Hungarian, Polish, Romanian | |
| windows-1251 | WINDOWS1251 | Russian | |
| windows-1252 | WINDOWS1252 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish | |
| windows-1253 | WINDOWS1253 | Greek | |
| windows-1254 | WINDOWS1254 | Turkish | |
| windows-1255 | WINDOWS1255 | Hebrew | |
| windows-1256 | WINDOWS1256 | Arabic | |
- Default
UTF8
Note
Snowflake stores all data internally in the UTF-8 character set. The data is converted into UTF-8 before it is loaded into Snowflake. If the VALIDATE_UTF8 file format option is TRUE, Snowflake validates the UTF-8 character encoding in string column data after it is converted from its original character encoding.
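For example, a hypothetical stage for Shift_JIS-encoded CSV files could set the option as follows:

CREATE STAGE my_sjis_stage
  FILE_FORMAT = (TYPE = CSV ENCODING = 'SHIFTJIS');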
TYPE = JSON¶
COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
- Use
Data loading and unloading
- Definition
When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.
When unloading data, compresses the data file using the specified compression algorithm.
- Values
| Supported Values | Notes |
| --- | --- |
| AUTO | When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip. |
| GZIP | |
| BZ2 | |
| BROTLI | Must be specified if loading/unloading Brotli-compressed files. |
| ZSTD | Zstandard v0.8 (and higher) is supported. |
| DEFLATE | Deflate-compressed files (with zlib header, RFC1950). |
| RAW_DEFLATE | Raw Deflate-compressed files (without header, RFC1951). |
| NONE | When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed. |
- Default
AUTO
DATE_FORMAT = 'string' | AUTO
- Use
Data loading only
- Definition
Applied only when loading JSON data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Defines the format of date string values in the data files. If a value is not specified or is AUTO, the value for the DATE_INPUT_FORMAT parameter is used.

This file format option is currently a Preview Feature.
- Default
AUTO
TIME_FORMAT = 'string' | AUTO
- Use
Data loading only
- Definition
Applied only when loading JSON data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Defines the format of time string values in the data files. If a value is not specified or is AUTO, the value for the TIME_INPUT_FORMAT parameter is used.

This file format option is currently a Preview Feature.
- Default
AUTO
TIMESTAMP_FORMAT = 'string' | AUTO
- Use
Data loading only
- Definition
Applied only when loading JSON data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Defines the format of timestamp string values in the data files. If a value is not specified or is AUTO, the value for the TIMESTAMP_INPUT_FORMAT parameter is used.

This file format option is currently a Preview Feature.
- Default
AUTO
BINARY_FORMAT = HEX | BASE64 | UTF8
- Use
Data loading only
- Definition
Applied only when loading JSON data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Defines the encoding format for binary string values in the data files. The option can be used when loading data into binary columns in a table.
This file format option is currently a Preview Feature.
- Default
HEX
TRIM_SPACE = TRUE | FALSE
- Use
Data loading only
- Definition
Applied only when loading JSON data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Boolean that specifies whether to remove leading and trailing white space from strings.
For example, if your external database software encloses fields in quotes, but inserts a leading space, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field (i.e. the quotation marks are interpreted as part of the string of field data). Set this option to TRUE to remove undesirable spaces during the data load.

This file format option is currently a Preview Feature.
- Default
FALSE
NULL_IF = ( 'string1' [ , 'string2' , ... ] )
- Use
Data loading only
- Definition
Applied only when loading JSON data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). String used to convert to and from SQL NULL. Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.
For example:
NULL_IF = ('\\N', 'NULL', 'NUL', '')
Note that this option can include empty strings.
This file format option is currently a Preview Feature.
- Default
\\N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \\)
FILE_EXTENSION = 'string' | NONE
- Use
Data unloading only
- Definition
Specifies the extension for files unloaded to a stage. Accepts any extension. The user is responsible for specifying a file extension that can be read by any desired software or services.
- Default
null, meaning the file extension is determined by the format type: .json[compression], where compression is the extension added by the compression method, if COMPRESSION is set.
ENABLE_OCTAL = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that enables parsing of octal numbers.
- Default
FALSE
ALLOW_DUPLICATE = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies to allow duplicate object field names (only the last one will be preserved).
- Default
FALSE
STRIP_OUTER_ARRAY = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that instructs the JSON parser to remove outer brackets (i.e. [ ]).

- Default
FALSE
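For instance, a sketch of a stage (the name is illustrative) for JSON files that wrap all records in one outer array:

CREATE STAGE my_json_stage
  FILE_FORMAT = (TYPE = JSON STRIP_OUTER_ARRAY = TRUE);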
STRIP_NULL_VALUES = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that instructs the JSON parser to remove object fields or array elements containing null values. For example, when set to TRUE:

| Before | After |
| --- | --- |
| [null] | [] |
| [null,null,3] | [,,3] |
| {"a":null,"b":null,"c":123} | {"c":123} |
| {"a":[1,null,2],"b":{"x":null,"y":88}} | {"a":[1,,2],"b":{"y":88}} |

- Default
FALSE
REPLACE_INVALID_CHARACTERS = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to replace invalid UTF-8 characters with the Unicode replacement character (�). The copy option performs a one-to-one character replacement.
This file format option is currently a Preview Feature.
- Values
If set to TRUE, Snowflake replaces invalid UTF-8 characters with the Unicode replacement character.

If set to FALSE, the load operation produces an error when invalid UTF-8 character encoding is detected.

- Default
FALSE
IGNORE_UTF8_ERRORS = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether UTF-8 encoding errors produce error conditions. If set to TRUE, any invalid UTF-8 sequences are silently replaced with the Unicode character U+FFFD (i.e. “replacement character”).

Note
This copy option removes all non-UTF-8 characters during the data load, but there is no guarantee of a one-to-one character replacement. We recommend using the REPLACE_INVALID_CHARACTERS copy option instead.
- Default
FALSE
SKIP_BYTE_ORDER_MARK = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to skip the BOM (byte order mark), if present in a data file. A BOM is a character code at the beginning of a data file that defines the byte order and encoding form.
If set to FALSE, Snowflake recognizes any BOM in data files, which could result in the BOM either causing an error or being merged into the first column in the table.

- Default
TRUE
TYPE = AVRO¶
COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
- Use
Data loading and unloading
- Definition
When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.
When unloading data, compresses the data file using the specified compression algorithm.
- Values
| Supported Values | Notes |
| --- | --- |
| AUTO | When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip. |
| GZIP | |
| BZ2 | |
| BROTLI | Must be specified if loading/unloading Brotli-compressed files. |
| ZSTD | Zstandard v0.8 (and higher) is supported. |
| DEFLATE | Deflate-compressed files (with zlib header, RFC1950). |
| RAW_DEFLATE | Raw Deflate-compressed files (without header, RFC1951). |
| NONE | When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed. |
- Default
AUTO
TRIM_SPACE = TRUE | FALSE
- Use
Data loading only
- Definition
Applied only when loading Avro data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Boolean that specifies whether to remove leading and trailing white space from strings.
For example, if your external database software encloses fields in quotes, but inserts a leading space, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field (i.e. the quotation marks are interpreted as part of the string of field data). Set this option to TRUE to remove undesirable spaces during the data load.

This file format option is currently a Preview Feature.
- Default
FALSE
NULL_IF = ( 'string1' [ , 'string2' , ... ] )
- Use
Data loading only
- Definition
Applied only when loading Avro data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). String used to convert to and from SQL NULL. Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.
For example:
NULL_IF = ('\\N', 'NULL', 'NUL', '')
Note that this option can include empty strings.
This file format option is currently a Preview Feature.
- Default
\\N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \\)
TYPE = ORC¶
TRIM_SPACE = TRUE | FALSE
- Use
Data loading only
- Definition
Applied only when loading ORC data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Boolean that specifies whether to remove leading and trailing white space from strings.
For example, if your external database software encloses fields in quotes, but inserts a leading space, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field (i.e. the quotation marks are interpreted as part of the string of field data). Set this option to TRUE to remove undesirable spaces during the data load.

This file format option is currently a Preview Feature.
- Default
FALSE
NULL_IF = ( 'string1' [ , 'string2' , ... ] )
- Use
Data loading only
- Definition
Applied only when loading ORC data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). String used to convert to and from SQL NULL. Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.
For example:
NULL_IF = ('\\N', 'NULL', 'NUL', '')
Note that this option can include empty strings.
This file format option is currently a Preview Feature.
- Default
\\N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \\)
TYPE = PARQUET¶
COMPRESSION = AUTO | LZO | SNAPPY | NONE
- Use
Data loading and unloading
- Definition
When loading data, specifies the current compression algorithm for columns in the Parquet files.
When unloading data, compresses the data file using the specified compression algorithm.
- Values
| Supported Values | Notes |
| --- | --- |
| AUTO | When loading data, compression algorithm detected automatically. Supports the following compression algorithms: Brotli, gzip, Lempel–Ziv–Oberhumer (LZO), LZ4, Snappy, or Zstandard v0.8 (and higher). When unloading data, unloaded files are compressed using the Snappy compression algorithm by default. |
| LZO | When unloading data, files are compressed using the Snappy algorithm by default. If unloading data to LZO-compressed files, specify this value. |
| SNAPPY | When unloading data, files are compressed using the Snappy algorithm by default. You can optionally specify this value. |
| NONE | When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed. |
- Default
AUTO
SNAPPY_COMPRESSION = TRUE | FALSE
- Use
Data unloading only
- Values
| Supported Values | Notes |
| --- | --- |
| AUTO | Unloaded files are compressed using the Snappy compression algorithm by default. |
| SNAPPY | May be specified if unloading Snappy-compressed files. |
| NONE | When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed. |
- Definition
Boolean that specifies whether unloaded file(s) are compressed using the SNAPPY algorithm.

Note

Deprecated. Use COMPRESSION = SNAPPY instead.

- Limitations
Only supported for data unloading operations.
- Default
TRUE
BINARY_AS_TEXT = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to interpret columns with no defined logical data type as UTF-8 text. When set to FALSE, Snowflake interprets these columns as binary data.

- Limitations
Only supported for data loading operations.
- Default
TRUE
TRIM_SPACE = TRUE | FALSE
- Use
Data loading only
- Definition
Applied only when loading Parquet data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Boolean that specifies whether to remove leading and trailing white space from strings.
For example, if your external database software encloses fields in quotes, but inserts a leading space, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field (i.e. the quotation marks are interpreted as part of the string of field data). Set this option to TRUE to remove undesirable spaces during the data load.

This file format option is currently a Preview Feature.
- Default
FALSE
NULL_IF = ( 'string1' [ , 'string2' , ... ] )
- Use
Data loading only
- Definition
Applied only when loading Parquet data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). String used to convert to and from SQL NULL. Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.
For example:
NULL_IF = ('\\N', 'NULL', 'NUL', '')
Note that this option can include empty strings.
This file format option is currently a Preview Feature.
- Default
\\N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \\)
TYPE = XML¶
COMPRESSION = AUTO | GZIP | BZ2 | BROTLI | ZSTD | DEFLATE | RAW_DEFLATE | NONE
- Use
Data loading and unloading
- Definition
When loading data, specifies the current compression algorithm for the data file. Snowflake uses this option to detect how an already-compressed data file was compressed so that the compressed data in the file can be extracted for loading.
When unloading data, compresses the data file using the specified compression algorithm.
- Values
| Supported Values | Notes |
| --- | --- |
| AUTO | When loading data, compression algorithm detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically. When unloading data, files are automatically compressed using the default, which is gzip. |
| GZIP | |
| BZ2 | |
| BROTLI | Must be specified if loading/unloading Brotli-compressed files. |
| ZSTD | Zstandard v0.8 (and higher) is supported. |
| DEFLATE | Deflate-compressed files (with zlib header, RFC1950). |
| RAW_DEFLATE | Raw Deflate-compressed files (without header, RFC1951). |
| NONE | When loading data, indicates that the files have not been compressed. When unloading data, specifies that the unloaded files are not compressed. |
- Default
AUTO
IGNORE_UTF8_ERRORS = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether UTF-8 encoding errors produce error conditions. If set to TRUE, any invalid UTF-8 sequences are silently replaced with Unicode character U+FFFD (i.e. “replacement character”).

- Default
FALSE
PRESERVE_SPACE = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether the XML parser preserves leading and trailing spaces in element content.
- Default
FALSE
STRIP_OUTER_ELEMENT = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether the XML parser strips out the outer XML element, exposing 2nd level elements as separate documents.
- Default
FALSE
DISABLE_SNOWFLAKE_DATA = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether the XML parser disables recognition of Snowflake semi-structured data tags.
- Default
FALSE
DISABLE_AUTO_CONVERT = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether the XML parser disables automatic conversion of numeric and Boolean values from text to native representation.
- Default
FALSE
SKIP_BYTE_ORDER_MARK = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to skip any BOM (byte order mark) present in an input file. A BOM is a character code at the beginning of a data file that defines the byte order and encoding form.
If set to FALSE, Snowflake recognizes any BOM in data files, which could result in the BOM either causing an error or being merged into the first column in the table.

- Default
TRUE
TRIM_SPACE = TRUE | FALSE
- Use
Data loading only
- Definition
Applied only when loading XML data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). Boolean that specifies whether to remove leading and trailing white space from strings.
For example, if your external database software encloses fields in quotes, but inserts a leading space, Snowflake reads the leading space rather than the opening quotation character as the beginning of the field (i.e. the quotation marks are interpreted as part of the string of field data). Set this option to TRUE to remove undesirable spaces during the data load.

This file format option is currently a Preview Feature.
- Default
FALSE
NULL_IF = ( 'string1' [ , 'string2' , ... ] )
- Use
Data loading only
- Definition
Applied only when loading XML data into separate columns (i.e. using the MATCH_BY_COLUMN_NAME copy option or a COPY transformation). String used to convert to and from SQL NULL. Snowflake replaces these strings in the data load source with SQL NULL. To specify more than one string, enclose the list of strings in parentheses and use commas to separate each value.
For example:
NULL_IF = ('\\N', 'NULL', 'NUL', '')
Note that this option can include empty strings.
This file format option is currently a Preview Feature.
- Default
\\N (i.e. NULL, which assumes the ESCAPE_UNENCLOSED_FIELD value is \\)
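As an illustrative sketch, a stage (the name is hypothetical) for XML documents whose outer element wraps the individual records:

CREATE STAGE my_xml_stage
  FILE_FORMAT = (TYPE = XML STRIP_OUTER_ELEMENT = TRUE);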
Copy Options (copyOptions)¶
You can specify one or more of the following copy options (separated by blank spaces, commas, or new lines):
ON_ERROR = CONTINUE | SKIP_FILE | SKIP_FILE_num | SKIP_FILE_num% | ABORT_STATEMENT
- Use
Data loading only
- Definition
String (constant) that specifies the action to perform when an error is encountered while loading data from a file:
- Values
| Supported Values | Notes |
| --- | --- |
| CONTINUE | Continue loading the file. The COPY statement returns an error message for a maximum of one error encountered per data file. Note that the difference between the ROWS_PARSED and ROWS_LOADED column values represents the number of rows that include detected errors. However, each of these rows could include multiple errors. To view all errors in the data files, use the VALIDATION_MODE parameter or query the VALIDATE function. |
| SKIP_FILE | Skip file if any errors encountered in the file. |
| SKIP_FILE_num (e.g. SKIP_FILE_10) | Skip file when the number of errors in the file is equal to or exceeds the specified number. |
| SKIP_FILE_num% (e.g. SKIP_FILE_10%) | Skip file when the percentage of errors in the file exceeds the specified percentage. |
| ABORT_STATEMENT | Abort the load operation if any error is encountered in a data file. Note that the load operation is not aborted if the data file cannot be found (e.g. because it does not exist or cannot be accessed). |
Note
The following limitations currently apply:
All ON_ERROR values work as expected when loading structured delimited data files (CSV, TSV, etc.) for both parsing and transformation errors.
Semi-structured data files (JSON, Avro, ORC, Parquet, or XML) currently do not support the same behavior semantics as structured data files for the following ON_ERROR values: CONTINUE, SKIP_FILE_num, or SKIP_FILE_num%, due to the design of those formats.

Parquet and ORC data only. When ON_ERROR is set to CONTINUE, SKIP_FILE_num, or SKIP_FILE_num%, any parsing error results in the data file being skipped. Any conversion or transformation errors use the default behavior of COPY (ABORT_STATEMENT) or Snowpipe (SKIP_FILE) regardless of selected option value.

JSON, XML, and Avro data only. When ON_ERROR is set to CONTINUE, SKIP_FILE_num, or SKIP_FILE_num%, the records up to the parsing error location are loaded while the remainder of the data file will be skipped. Any conversion or transformation errors use the default behavior of COPY (ABORT_STATEMENT) or Snowpipe (SKIP_FILE) regardless of selected option value.
- Default
- Bulk loading using COPY
ABORT_STATEMENT
- Snowpipe
SKIP_FILE
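For example, a hypothetical stage whose COPY operations skip any file in which the percentage of errors exceeds 5%:

CREATE STAGE my_tolerant_stage
  COPY_OPTIONS = (ON_ERROR = 'SKIP_FILE_5%');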
SIZE_LIMIT = num
- Use
Data loading only
- Definition
Number (> 0) that specifies the maximum size (in bytes) of data to be loaded for a given COPY statement. When the threshold is exceeded, the COPY operation discontinues loading files. This option is commonly used to load a common group of files using multiple COPY statements. For each statement, the data load continues until the specified SIZE_LIMIT is exceeded, before moving on to the next statement.

For example, suppose a set of files in a stage path were each 10 MB in size. If multiple COPY statements set SIZE_LIMIT to 25000000 (25 MB), each would load 3 files. That is, each COPY operation would discontinue after the SIZE_LIMIT threshold was exceeded.

Note that at least one file is loaded regardless of the value specified for SIZE_LIMIT, unless there is no file to be loaded.

- Default
null (no size limit)
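Continuing the example above, a sketch of a stage (the name is illustrative) whose COPY statements stop loading additional files once roughly 25 MB have been loaded:

CREATE STAGE my_limited_stage
  COPY_OPTIONS = (SIZE_LIMIT = 25000000);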
PURGE = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to remove the data files from the stage automatically after the data is loaded successfully.
If this option is set to TRUE, note that a best effort is made to remove successfully loaded data files. If the purge operation fails for any reason, no error is returned currently. We recommend that you list staged files periodically (using LIST) and manually remove successfully loaded files, if any exist.

- Default
FALSE
RETURN_FAILED_ONLY = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies whether to return only files that have failed to load in the statement result.
- Default
FALSE
MATCH_BY_COLUMN_NAME = CASE_SENSITIVE | CASE_INSENSITIVE | NONE
- Use
Data loading only
- Definition
String that specifies whether to load semi-structured data into columns in the target table that match corresponding columns represented in the data.
This copy option is supported for the following data formats:
JSON
Avro
ORC
Parquet
For a column to match, the following criteria must be true:
The column represented in the data must have the exact same name as the column in the table. The copy option supports case sensitivity for column names. Column order does not matter.
The column in the table must have a data type that is compatible with the values in the column represented in the data. For example, string, number, and Boolean values can all be loaded into a variant column.
- Values
CASE_SENSITIVE | CASE_INSENSITIVE

Load semi-structured data into columns in the target table that match corresponding columns represented in the data. Column names are either case-sensitive (CASE_SENSITIVE) or case-insensitive (CASE_INSENSITIVE).

The COPY operation verifies that at least one column in the target table matches a column represented in the data files. If a match is found, the values in the data files are loaded into the column or columns. If no match is found, a set of NULL values for each record in the files is loaded into the table.
Note
If additional non-matching columns are present in the data files, the values in these columns are not loaded.
If additional non-matching columns are present in the target table, the COPY operation inserts NULL values into these columns. These columns must support NULL values.
The COPY statement does not allow specifying a query to further transform the data during the load (i.e. COPY transformation).
NONE
The COPY operation loads the semi-structured data into a variant column or, if a query is included in the COPY statement, transforms the data.
Note
The following limitations currently apply:
MATCH_BY_COLUMN_NAME cannot be used with the VALIDATION_MODE parameter in a COPY statement to validate the staged data rather than load it into the target table.

Parquet data only. When MATCH_BY_COLUMN_NAME is set to CASE_SENSITIVE or CASE_INSENSITIVE, an empty column value (e.g. "col1": "") produces an error.
- Default
NONE
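A minimal sketch (names are hypothetical) of a stage for loading Parquet files by column name, case-insensitively:

CREATE STAGE my_parquet_stage
  FILE_FORMAT = (TYPE = PARQUET)
  COPY_OPTIONS = (MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE);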
ENFORCE_LENGTH = TRUE | FALSE
- Use
Data loading only
- Definition
Alternative syntax for TRUNCATECOLUMNS with reverse logic (for compatibility with other systems).

Boolean that specifies whether to truncate text strings that exceed the target column length:

If TRUE, the COPY statement produces an error if a loaded string exceeds the target column length.

If FALSE, strings are automatically truncated to the target column length.

Currently, this copy option supports CSV data only.

Note

If the length of the target string column is set to the maximum (e.g. VARCHAR (16777216)), an incoming string cannot exceed this length; otherwise, the COPY command produces an error.

This parameter is functionally equivalent to TRUNCATECOLUMNS, but has the opposite behavior. It is provided for compatibility with other databases. It is only necessary to include one of these two parameters in a COPY statement to produce the desired output.
- Default
TRUE
TRUNCATECOLUMNS = TRUE | FALSE
- Use
Data loading only
- Definition
Alternative syntax for ENFORCE_LENGTH with reverse logic (for compatibility with other systems).

Boolean that specifies whether to truncate text strings that exceed the target column length:

If TRUE, strings are automatically truncated to the target column length.

If FALSE, the COPY statement produces an error if a loaded string exceeds the target column length.

Currently, this copy option supports CSV data only.

Note

If the length of the target string column is set to the maximum (e.g. VARCHAR (16777216)), an incoming string cannot exceed this length; otherwise, the COPY command produces an error.

This parameter is functionally equivalent to ENFORCE_LENGTH, but has the opposite behavior. It is provided for compatibility with other databases. It is only necessary to include one of these two parameters in a COPY statement to produce the desired output.
- Default
FALSE
FORCE = TRUE | FALSE
- Use
Data loading only
- Definition
Boolean that specifies to load all files, regardless of whether they’ve been loaded previously and have not changed since they were loaded. Note that this option reloads files, potentially duplicating data in a table.
- Default
FALSE
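As a sketch, a stage (the name is hypothetical) that always reloads staged files and removes them after a successful load; note the potential for duplicate rows when FORCE = TRUE:

CREATE STAGE my_reload_stage
  COPY_OPTIONS = (FORCE = TRUE PURGE = TRUE);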
Usage Notes¶
URL, STORAGE_INTEGRATION, CREDENTIALS, and ENCRYPTION only apply to external stages. If you are creating an internal stage, you can ignore these parameters.

CREATE STAGE does not check whether the specified URL or credentials are valid. If the credentials are not valid, when you attempt to use the stage, the system returns an error.
If referencing a file format in the current namespace, you can omit the single quotes around the format identifier.
Examples¶
Internal Stages¶
Create an internal stage named my_int_stage with the default file format type (CSV):

All the corresponding default CSV file format options are used.

All the default copy options are used, except for ON_ERROR. If a COPY INTO <table> command that references this stage encounters a data error on any of the records, it skips the file.

CREATE OR REPLACE STAGE my_int_stage
  COPY_OPTIONS = (ON_ERROR = 'skip_file');
Create a temporary internal stage named my_temp_int_stage with all the same properties as the previous example, except the copy option to skip files on error:
CREATE OR REPLACE TEMPORARY STAGE my_temp_int_stage;
Create a temporary internal stage named my_int_stage that references a file format named my_csv_format (created using CREATE FILE FORMAT):

CREATE OR REPLACE TEMPORARY STAGE my_int_stage
  FILE_FORMAT = my_csv_format;

When you reference the stage in a COPY INTO <table> statement, the file format options are automatically set.
External Stages¶
Amazon S3
Create an external stage named my_ext_stage using a private/protected S3 bucket named load with a folder path named files. Secure access to the S3 bucket is provided via the myint storage integration:

CREATE OR REPLACE STAGE my_ext_stage
  URL='s3://load/files/'
  STORAGE_INTEGRATION = myint;

Create an external stage named my_ext_stage1 using a private/protected S3 bucket named load with a folder path named files. The Snowflake access permissions for the S3 bucket are associated with an IAM user; therefore, IAM credentials are required:

CREATE OR REPLACE STAGE my_ext_stage1
  URL='s3://load/files/'
  CREDENTIALS=(AWS_KEY_ID='1a2b3c' AWS_SECRET_KEY='4x5y6z');

Note that the AWS_KEY_ID and AWS_SECRET_KEY values used in this example are for illustration purposes only.

Create an external stage named my_ext_stage2 using an S3 bucket named load with a folder path named encrypted_files and client-side encryption (default encryption type) with the master key to decrypt/encrypt files stored in the bucket:

CREATE OR REPLACE STAGE my_ext_stage2
  URL='s3://load/encrypted_files/'
  CREDENTIALS=(AWS_KEY_ID='1a2b3c' AWS_SECRET_KEY='4x5y6z')
  ENCRYPTION=(MASTER_KEY = 'eSxX0jzYfIamtnBKOEOwq80Au6NbSgPH5r4BDDwOaO8=');

Create an external stage named my_ext_stage3 using an S3 bucket named load with a folder path named encrypted_files and AWS_SSE_KMS server-side encryption with the ID for the master key to decrypt/encrypt files stored in the bucket:

CREATE OR REPLACE STAGE my_ext_stage3
  URL='s3://load/encrypted_files/'
  CREDENTIALS=(AWS_KEY_ID='1a2b3c' AWS_SECRET_KEY='4x5y6z')
  ENCRYPTION=(TYPE='AWS_SSE_KMS' KMS_KEY_ID = 'aws/key');

Same example as the immediately preceding example, except that the Snowflake access permissions for the S3 bucket are associated with an IAM role instead of an IAM user. Note that credentials are handled separately from other stage parameters such as ENCRYPTION. Support for these other parameters is the same regardless of the credentials used to access your external S3 bucket:

CREATE OR REPLACE STAGE my_ext_stage3
  URL='s3://load/encrypted_files/'
  CREDENTIALS=(AWS_ROLE='arn:aws:iam::001234567890:role/mysnowflakerole')
  ENCRYPTION=(TYPE='AWS_SSE_KMS' KMS_KEY_ID = 'aws/key');
Google Cloud Storage
Create an external stage named my_ext_stage using a private/protected GCS bucket named load with a folder path named files. Secure access to the GCS bucket is provided via the myint storage integration:

CREATE OR REPLACE STAGE my_ext_stage
  URL='gcs://load/files/'
  STORAGE_INTEGRATION = myint;
Microsoft Azure
Create an external stage named my_ext_stage using a private/protected Azure container named load with a folder path named files. Secure access to the container is provided via the myint storage integration:

CREATE OR REPLACE STAGE my_ext_stage
  URL='azure://myaccount.blob.core.windows.net/load/files/'
  STORAGE_INTEGRATION = myint;

Create an external stage named mystage using an Azure storage account named myaccount and a container named mycontainer with a folder path named files and client-side encryption enabled. The stage references a file format named my_csv_format:

CREATE OR REPLACE STAGE mystage
  URL='azure://myaccount.blob.core.windows.net/mycontainer/files/'
  CREDENTIALS=(AZURE_SAS_TOKEN='?sv=2016-05-31&ss=b&srt=sco&sp=rwdl&se=2018-06-27T10:05:50Z&st=2017-06-27T02:05:50Z&spr=https,http&sig=bgqQwoXwxzuD2GJfagRg7VOS8hzNr3QLT7rhS8OFRLQ%3D')
  ENCRYPTION=(TYPE='AZURE_CSE' MASTER_KEY = 'kPxX0jzYfIamtnJEUTHwq80Au6NbSgPH5r4BDDwOaO8=')
  FILE_FORMAT = my_csv_format;

(The AZURE_SAS_TOKEN and MASTER_KEY values used in this example are not actual values; they are provided for illustration purposes only.)