SQL data types: Changes to maximum length, output, and error messages (Pending)

Attention

This behavior change is in the 2024_08 bundle.

For the current status of the bundle, refer to Bundle History.

With this behavior change, compiled SQL expressions and some error messages behave as follows:

Before the change:
  • In compiled SQL expressions and error messages, Snowflake explicitly specified the length of the data type (for example, VARCHAR(16777216)).

  • When loading objects larger than 16 MB, an error related to parsing or processing a large string or file is returned (for example, 100069 (22P02): Error parsing JSON: document is too large, max size 16777216 bytes).

After the change:
  • In compiled SQL expressions and error messages, Snowflake omits the length of the data type (for example, VARCHAR).

  • When loading objects larger than 16 MB, an error related to storing a large object is returned (for example, 100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <actual_size>).

Previously, an error occurred when you attempted to query an object larger than 16 MB (8 MB for BINARY, GEOMETRY, and GEOGRAPHY) on a stage. You can now read and process objects up to 128 MB in size (64 MB for BINARY, GEOMETRY, and GEOGRAPHY). Although you still can’t store objects larger than 16 MB in a column or output them in a result set, you can query objects up to 128 MB in files on a stage and reduce their size before storing them in columns.

For more information, see Reducing the size of objects larger than 16 MB before loading.
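
For example, the following minimal sketch (which assumes the hypothetical lob_int_stage stage, driver_status.json.gz file, and json_format file format used in other examples in this topic) reads a large staged value and truncates it with SUBSTR so that the stored value stays under the 16 MB column limit:

-- Hypothetical stage, file, and file format names.
-- The parsed value can exceed 16 MB in memory; SUBSTR reduces it before it is stored.
CREATE OR REPLACE TABLE reduced_strings AS
  SELECT SUBSTR($1::VARCHAR, 1, 1000000) AS lob_column
    FROM @lob_int_stage/driver_status.json.gz (FILE_FORMAT => 'json_format');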

The new size limit isn’t explicitly exposed in SQL query output or metadata. However, you can observe the increased limit implicitly by creating or reading larger objects without storing them. Enabling this feature introduces the following behavior changes:

  • VARCHAR and BINARY types appear without length in the output of GET_DDL, SHOW, and DESCRIBE commands for column expressions, UDFs, and stored procedures.

    For example, VARCHAR is shown instead of VARCHAR(16777216). This change applies only to newly created objects for which you haven’t explicitly specified the length in the DDL statement; it doesn’t apply to existing objects. (A sketch illustrating this appears after this list.)

  • Some statements that previously failed with a maximum size exceeded (or similar) error now succeed.

    Statements that only load or create a large value, but never store or return it, now succeed.

  • Some statements that previously failed with a maximum size exceeded (or similar) error continue to fail, but with a different error code or message.

    The new error code and message are still related to exceeding the 16 MB limit, but the error can originate from a different part of the execution plan. For example, cannot load value might change to cannot store value or cannot output value.
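
The following sketch (using a hypothetical UDF name) illustrates the first point: a length that you explicitly specify in the DDL statement is preserved in GET_DDL output; only unspecified lengths are omitted:

-- Hypothetical UDF with explicitly specified lengths.
CREATE OR REPLACE FUNCTION udf_varchar_100(g1 VARCHAR(100))
  RETURNS VARCHAR(100)
  AS $$
    'Hello' || g1
  $$;

SELECT GET_DDL('function', 'udf_varchar_100(VARCHAR)');

-- Because the lengths were explicitly specified, the returned DDL still shows
-- "G1" VARCHAR(100) and RETURNS VARCHAR(100).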

The first change affects all customers. The second and third changes affect customers who try to load or generate objects larger than 16 MB.

Important

We strongly advise against creating logic that depends on error messages associated with objects larger than 16 MB. Instead, you can build logic that uses the BIT_LENGTH function to check the size of the value.
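
For example, the following minimal sketch (using the hypothetical lob_strings table and large_str column that appear in examples later in this topic) checks whether a value fits within the 16 MB limit before you store or return it:

-- BIT_LENGTH returns the length in bits, so the 16 MB limit is 16777216 * 8 bits.
SELECT
  CASE
    WHEN BIT_LENGTH(large_str) > 16777216 * 8 THEN 'too large to store'
    ELSE 'fits in a column'
  END AS size_check
FROM lob_strings;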

Changes in metadata

There are behavior changes that affect the types of operations described in the following sections. For these operations, there are changes in the metadata of the result set.

Note

This list is not exhaustive.

Returning metadata for UDFs

For new user-defined functions (UDFs) that use VARCHAR or BINARY values as input or output, changes in the metadata for DDL statements related to UDFs affect the output returned when you call the GET_DDL function, run the DESCRIBE FUNCTION statement, or query the event table. The following example creates a UDF:

CREATE OR REPLACE FUNCTION udf_varchar(g1 VARCHAR)
  RETURNS VARCHAR
  AS $$
    'Hello' || g1
  $$;

GET_DDL

The metadata returned from a GET_DDL function call changes in the following way:

SELECT GET_DDL('function', 'udf_varchar(VARCHAR)');
Metadata before the change:
CREATE OR REPLACE FUNCTION "UDF_VARCHAR"("G1" VARCHAR(16777216))
RETURNS VARCHAR(16777216)
LANGUAGE SQL
AS '
  ''Hello'' || g1
';
Metadata after the change:
CREATE OR REPLACE FUNCTION "UDF_VARCHAR"("G1" VARCHAR)
RETURNS VARCHAR
LANGUAGE SQL
AS '
  ''Hello'' || g1
';

DESCRIBE FUNCTION

The metadata returned for a DESCRIBE FUNCTION statement changes in the following way:

DESCRIBE FUNCTION udf_varchar(VARCHAR);
Metadata before the change:
+-----------+-------------------+
| property  | value             |
|-----------+-------------------|
| signature | (G1 VARCHAR)      |
| returns   | VARCHAR(16777216) |
| language  | SQL               |
| body      |                   |
|           |   'Hello' || g1   |
|           |                   |
+-----------+-------------------+
Metadata after the change:
+-----------+-------------------+
| property  | value             |
|-----------+-------------------|
| signature | (G1 VARCHAR)      |
| returns   | VARCHAR           |
| language  | SQL               |
| body      |                   |
|           |   'Hello' || g1   |
|           |                   |
+-----------+-------------------+

Event table

For new user-defined functions that return VARCHAR or BINARY values as output, the snow.executable.name attribute in the RESOURCE_ATTRIBUTES column of the event table changes as follows:

Metadata before the change:
{
  "db.user": "MYUSERNAME",
  "snow.database.id": 13,
  "snow.database.name": "MY_DB",
  "snow.executable.id": 197,
  "snow.executable.name": "UDF_VARCHAR(X VARCHAR):VARCHAR(16777216)",
  "snow.executable.type": "FUNCTION",
  "snow.owner.id": 2,
  "snow.owner.name": "MY_ROLE",
  "snow.query.id": "01ab0f07-0000-15c8-0000-0129000592c2",
  "snow.schema.id": 16,
  "snow.schema.name": "PUBLIC",
  "snow.session.id": 1275605667850,
  "snow.session.role.primary.id": 2,
  "snow.session.role.primary.name": "MY_ROLE",
  "snow.user.id": 25,
  "snow.warehouse.id": 5,
  "snow.warehouse.name": "MYWH",
  "telemetry.sdk.language": "python"
}
Metadata after the change:
{
  "db.user": "MYUSERNAME",
  "snow.database.id": 13,
  "snow.database.name": "MY_DB",
  "snow.executable.id": 197,
  "snow.executable.name": "UDF_VARCHAR(X VARCHAR):VARCHAR",
  "snow.executable.type": "FUNCTION",
  "snow.owner.id": 2,
  "snow.owner.name": "MY_ROLE",
  "snow.query.id": "01ab0f07-0000-15c8-0000-0129000592c2",
  "snow.schema.id": 16,
  "snow.schema.name": "PUBLIC",
  "snow.session.id": 1275605667850,
  "snow.session.role.primary.id": 2,
  "snow.session.role.primary.name": "MY_ROLE",
  "snow.user.id": 25,
  "snow.warehouse.id": 5,
  "snow.warehouse.name": "MYWH",
  "telemetry.sdk.language": "python"
}

Returning metadata for tables with column expressions

For new tables that use VARCHAR or BINARY in column expressions, changes in the metadata for DDL statements related to these columns affect the output returned when you call the GET_DDL function.

The following example creates a table with a column expression:

CREATE OR REPLACE TABLE table_with_default(x INT, v TEXT DEFAULT x::VARCHAR);

The metadata returned from a GET_DDL function call changes in the following way:

SELECT GET_DDL('table', 'table_with_default');
Metadata before the change:
create or replace TABLE TABLE_WITH_DEFAULT (
      X NUMBER(38,0),
      V VARCHAR(16777216) DEFAULT CAST(TABLE_WITH_DEFAULT.X AS VARCHAR(16777216))
);
Metadata after the change:
create or replace TABLE TABLE_WITH_DEFAULT (
      X NUMBER(38,0),
      V VARCHAR(16777216) DEFAULT CAST(TABLE_WITH_DEFAULT.X AS VARCHAR)
);

External tables

The following example creates an external table:

CREATE OR REPLACE EXTERNAL TABLE ext_table(
    data_str VARCHAR AS (value:c1::TEXT))
  LOCATION = @csv_stage
  AUTO_REFRESH = false
  FILE_FORMAT =(type = csv);

The metadata returned from a GET_DDL function call changes in the following way:

SELECT GET_DDL('table', 'ext_table');
Metadata before the change:
create or replace external table EXT_TABLE(
      DATA_STR VARCHAR(16777216) AS (CAST(GET(VALUE, 'c1') AS VARCHAR(16777216))))
location=@CSV_STAGE/
auto_refresh=false
file_format=(TYPE=csv)
;
Metadata after the change:
create or replace external table EXT_TABLE(
      DATA_STR VARCHAR(16777216) AS (CAST(GET(VALUE, 'c1') AS VARCHAR)))
location=@CSV_STAGE/
auto_refresh=false
file_format=(TYPE=csv)
;

Returning metadata for SYSTEM$TYPEOF

The metadata returned for a call to the SYSTEM$TYPEOF function changes in the following way:

SELECT SYSTEM$TYPEOF(REPEAT('a',10));
Metadata before the change:
VARCHAR(16777216)[LOB]
Metadata after the change:
VARCHAR[LOB]

Returning metadata for SHOW COLUMNS

This change affects both existing and new tables. The metadata returned for a SHOW COLUMNS statement changes in the following way:

CREATE OR REPLACE TABLE t AS
  SELECT TO_VARIANT('abc') AS col;

SHOW COLUMNS IN t;
Metadata before the change:
{
  "type":"VARIANT",
  "length":16777216,
  "byteLength":16777216,
  "nullable":true,
  "fixed":false
}
Metadata after the change:
{
  "type":"VARIANT",
  "nullable":true,
  "fixed":false
}

Changes in loading and processing objects larger than 16 MB

There are behavior changes that affect cases where you try to load or process objects larger than 16 MB using the types of operations described in the following sections.

Note

This list is not exhaustive.

Loading data by scanning files on a stage

When you attempt to load data larger than 16 MB by scanning files on a stage, an error message is returned.
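
For example, the following sketch (which assumes the hypothetical table, stage, file, and file format used in the other examples in this topic) scans a staged JSON file and attempts to insert a value larger than 16 MB:

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR);

INSERT INTO table_varchar (lob_column)
  SELECT $1::VARCHAR
    FROM @lob_int_stage/driver_status.json.gz (FILE_FORMAT => 'json_format');

-- This statement fails with an error indicating that the 16 MB storage limit was exceeded.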

Loading a whole large object using CREATE TABLE AS SELECT

A different error message appears when you try to use a CREATE TABLE AS SELECT statement to load objects that are larger than 16 MB for VARCHAR, VARIANT, OBJECT, and ARRAY (or larger than 8 MB for BINARY, GEOMETRY, or GEOGRAPHY). The error depends on the type of the source. The same message change applies when an INSERT INTO SELECT statement is used for this scenario.

Loading a whole large object from a JSON source

The following example tries to load a whole object larger than 16 MB from a JSON source using CREATE TABLE AS SELECT:

CREATE OR REPLACE FILE FORMAT json_format TYPE = JSON;

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR) AS
  SELECT $1::VARCHAR
    FROM @lob_int_stage/driver_status.json.gz (FILE_FORMAT => 'json_format');
Error message before the change:
100069 (22P02): Error parsing JSON: document is too large, max size 16777216 bytes
Error message after the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <actual_size>
Loading a whole large object from an XML source

The following example tries to load a whole object larger than 16 MB from an XML source using CREATE TABLE AS SELECT:

CREATE OR REPLACE FILE FORMAT xml_format TYPE = XML;

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR) AS
  SELECT $1 AS XML
    FROM @lob_int_stage/large_xml.xte (FILE_FORMAT => 'xml_format');
Error message before the change:
100100 (22P02): Error parsing XML: document is too large, max size 16777216 bytes
Error message after the change:
100078 (22000): String '<string_preview>' is too long and would be truncated

Loading a whole large object using COPY INTO <table_name> … FROM SELECT

A different error message appears when you try to use a COPY INTO <table_name> … FROM SELECT statement to load objects that are larger than 16 MB for VARCHAR, VARIANT, OBJECT, and ARRAY (or larger than 8 MB for BINARY, GEOMETRY, or GEOGRAPHY). The error depends on the type of the source.

Important

If you attempt to load data that contains objects larger than 16 MB by using the COPY INTO command with ON_ERROR=CONTINUE, and you rely on the error messages written to the error log, the change in those messages could break any logic that depends on them.

Loading a whole large object from a JSON source

The following example tries to load a whole object larger than 16 MB from a JSON source using COPY INTO <table_name> … FROM SELECT:

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR);

COPY INTO table_varchar FROM (
  SELECT $1::VARCHAR
    FROM @lob_int_stage/driver_status.json.gz (FILE_FORMAT => 'json_format'));
Error message before the change:
100069 (22P02): Error parsing JSON: document is too large, max size 16777216 bytes
Error message after the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <actual_size>
Loading large nested objects from a JSON source

The following example tries to load JSON data when accessing large nested objects:

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR);

COPY INTO table_varchar FROM (
  SELECT $1:"Driver_Status"
    FROM @lob_int_stage/driver_status.json.gz (FILE_FORMAT => 'json_format'));
Error message before the change:
100069 (22P02): Max LOB size (16777216) exceeded, actual size of parsed column is <object_size>
Error message after the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <actual_size>
Loading a whole large object from an XML source

The following example tries to load a whole object larger than 16 MB from an XML source using COPY INTO <table_name> … FROM SELECT:

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR);

COPY INTO table_varchar FROM (
  SELECT $1::VARCHAR AS lob_column
    FROM @lob_int_stage/large_xml.xte (FILE_FORMAT => 'xml_format'));
Error message before the change:
100100 (22P02): Error parsing XML: document is too large, max size 16777216 bytes
Error message after the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <object_size>

Loading a whole large object using COPY INTO <table_name> … FROM <stage_or_location>

A different error message appears when you try to use a COPY INTO <table_name> … FROM <stage_or_location> statement to load objects that are larger than 16 MB for VARCHAR, VARIANT, OBJECT, and ARRAY (or larger than 8 MB for BINARY, GEOMETRY, or GEOGRAPHY). The error depends on the type of the source.

If you use the COPY command with large objects, queries might fail even when the ON_ERROR parameter is set to CONTINUE. For more information, see the usage notes for the COPY command.

Important

If you attempt to load data that contains objects larger than 16 MB by using the COPY INTO command with ON_ERROR=CONTINUE, and you rely on the error messages written to the error log, the change in those messages could break any logic that depends on them.

Loading a whole large object from a JSON source

The following example tries to load a whole object larger than 16 MB from a JSON source using COPY INTO <table_name> … FROM <stage_or_location>:

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR);

COPY INTO table_varchar (lob_column)
  FROM @lob_int_stage/driver_status.json.gz
  FILE_FORMAT = (FORMAT_NAME = json_format);
Error message before the change:
100069 (22P02): Error parsing JSON: document is too large, max size 16777216 bytes
Error message after the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <actual_size>
Loading a whole large object from an XML source

The following example tries to load a whole object larger than 16 MB from an XML source using COPY INTO <table_name> … FROM <stage_or_location>:

CREATE OR REPLACE TABLE table_varchar (lob_column VARCHAR);

COPY INTO table_varchar (lob_column)
  FROM @lob_int_stage/large_xml.xte
  FILE_FORMAT = (FORMAT_NAME = xml_format);
Error message before the change:
100100 (22P02): Error parsing XML: document is too large, max size 16777216 bytes
Error message after the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <actual_size>

Querying a whole large object from a source file

Because objects larger than 16 MB currently are not allowed in a result set, a different error message appears when you try to query objects from a source file that are larger than 16 MB for VARCHAR, VARIANT, OBJECT, and ARRAY (or larger than 8 MB for BINARY, GEOMETRY, or GEOGRAPHY). The error depends on the type of the source.

Querying a whole large object from a JSON source

The following example tries to query a whole object larger than 16 MB from a JSON source:

SELECT $1
  FROM @lob_int_stage/driver_status.json.gz (FILE_FORMAT => 'json_format');
Error message before the change:
100069 (22P02): Error parsing JSON: document is too large, max size 16777216 bytes
Error message after the change:
100082 (22000): The data length in result column $1 is not supported by this version of the client. Actual length <actual_length> exceeds supported length of 16777216.

Querying a whole large object from an XML source

The following example tries to query a whole object larger than 16 MB from an XML source:

SELECT $1 as lob_column
  FROM @lob_int_stage/large_xml.xte (FILE_FORMAT => 'xml_format');
Error message before the change:
100100 (22P02): Error parsing XML: document is too large, max size 16777216 bytes
Error message after the change:
100082 (22000): The data length in result column $1 is not supported by this version of the client. Actual length <actual_length> exceeds supported length of 16777216.

Querying a whole large object from a CSV source

The following example tries to query a whole object larger than 16 MB from a CSV source:

SELECT $1
  FROM @lob_int_stage/driver_status.csv.gz (FILE_FORMAT => 'csv_format');
Error message before the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <object_size>
Error message after the change:
100082 (22000): The data length in result column $1 is not supported by this version of the client. Actual length <actual_length> exceeds supported length of 16777216.

Querying a whole large object from a Parquet source

The following example tries to query a whole object larger than 16 MB from a Parquet source:

SELECT $1
  FROM @lob_int_stage/driver_status.parquet (FILE_FORMAT => 'parquet_format');
Error message before the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <object_size>
Error message after the change:
100082 (22000): The data length in result column $1 is not supported by this version of the client. Actual length <actual_length> exceeds supported length of 16777216.

Including large objects in query results

You can now create objects larger than 16 MB in memory. However, you cannot include these objects in query results or store them in a table. When you attempt to do so, an error message is returned.

Attempting to include an object larger than 16 MB in query results

The following query attempts to concatenate two large strings:

SELECT large_str || large_str FROM lob_strings;
Error message before the change:
100078 (22000): String '<preview_of_string>' is too long and would be truncated in 'CONCAT'
Error message after the change:
100067 (54000): The data length in result column <column_name> is not supported by this version of the client. Actual length <actual_size> exceeds supported length of 16777216.

Attempting to store an object larger than 16 MB in a table

The following CREATE TABLE AS SELECT statement attempts to concatenate two large strings:

CREATE OR REPLACE TABLE table_varchar
  AS SELECT large_str || large_str as LOB_column
  FROM lob_strings;
Error message before the change:
100078 (22000): String '<preview_of_string>' is too long and would be truncated in 'CONCAT'
Error message after the change:
100067 (54000): The data length in result column <column_name> is not supported by this version of the client. Actual length <actual_size> exceeds supported length of 16777216.

Creating a large object using aggregation

When you try to create an object larger than 16 MB and return it as output, an error message is returned.

The following example uses the ARRAY_AGG function in a query of a large object column:

SELECT ARRAY_AGG(status) FROM lob_object;
Error message before the change:
100082 (22000): Max LOB size (16777216) exceeded, actual size of parsed column is <actual_size>
Error message after the change:
100067 (54000): The data length in result column <column_name> is not supported by this version of the client. Actual length <actual_size> exceeds supported length of 16777216.

Ref: 1779