SQL Changes — General: Correctly set byteLength for VARCHAR string columns (Pending)

Attention

This behavior change is in the 2026_03 bundle.

For the current status of the bundle, refer to Bundle history.

This behavior change fixes the byte length calculation for VARCHAR columns so that it consistently accounts for UTF-8 encoding (up to 4 bytes per character). Before this fix, VARCHAR columns with a character length greater than 4,194,304 and up to 16,777,216 could report an incorrectly calculated byteLength.

Before the change:

For VARCHAR columns with character length > 4,194,304 and <= 16,777,216, the byteLength was incorrectly capped at 16,777,216 bytes. The cap did not account for UTF-8 encoding, which requires up to 4 bytes per character: because 4 x 4,194,304 = 16,777,216, every length above 4,194,304 in this range was under-reported.

For example:

CREATE TABLE example_table (
  col1 VARCHAR(10000000) -- 10M characters
);
SHOW COLUMNS IN TABLE example_table;

Result (relevant fields of the data_type column):

{
  "length": 10000000,
  "byteLength": 16777216
}

The byteLength should be 40,000,000 (4 x 10,000,000), but was incorrectly capped at 16,777,216.

After the change:

For VARCHAR columns with character length > 4,194,304 and <= 16,777,216, the byteLength is correctly calculated as 4 x character_length, properly accounting for UTF-8 encoding where each character can be up to 4 bytes.

Using the same example:

{
  "length": 10000000,
  "byteLength": 40000000
}

This change only affects new string columns. String columns with character length > 16,777,216 are not affected because byteLength is already correctly set for those cases. The byteLength is still capped at 134,217,728.
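
The arithmetic described above can be sketched in a few lines of Python. This is an illustrative model only, not the actual implementation: the function and constant names are hypothetical, and the assumption that lengths of 4,194,304 or less were already reported as 4 x character_length is inferred from the ranges given in this note.

```python
# Illustrative model of the byteLength calculation described in this note.
# All names here are hypothetical; this is not the actual server-side code.

MAX_BYTES_PER_CHAR = 4          # UTF-8 needs at most 4 bytes per character
OLD_RANGE_LIMIT = 16_777_216    # upper character length of the affected range
OLD_CAP = 16_777_216            # incorrect cap applied before the change
MAX_BYTE_LENGTH = 134_217_728   # overall cap that still applies after the change


def byte_length_before(char_length: int) -> int:
    """byteLength as reported before the change (assumed model)."""
    raw = MAX_BYTES_PER_CHAR * char_length
    if char_length <= OLD_RANGE_LIMIT:
        # The old cap only takes effect above 4,194,304 characters,
        # because 4 x 4,194,304 equals the cap exactly.
        return min(raw, OLD_CAP)
    # Lengths above 16,777,216 were already calculated correctly.
    return min(raw, MAX_BYTE_LENGTH)


def byte_length_after(char_length: int) -> int:
    """byteLength as reported after the change (assumed model)."""
    return min(MAX_BYTES_PER_CHAR * char_length, MAX_BYTE_LENGTH)


if __name__ == "__main__":
    for n in (4_194_304, 10_000_000, 16_777_216, 33_554_432):
        print(n, byte_length_before(n), byte_length_after(n))
```

For the VARCHAR(10000000) example, this model gives 16,777,216 before the change and 40,000,000 after it, matching the results shown above. At 4,194,304 characters both functions agree, which is why shorter columns are unaffected.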

Ref: 2286