SQL functions: Passing in columns that have the upper
, lower
, or trim
collation specifier¶
Attention
This behavior change is in the 2024_02 bundle.
For the current status of the bundle, refer to Bundle History.
The upper
and lower
collation specifiers convert strings to upper or lower case
(respectively) before strings are compared. The trim
collation specifier removes leading and trailing spaces before strings
are compared.
In cases when you pass in columns with these specifiers to some of the SQL functions, the behavior changes in the following ways:
- Before the change:
The LIKE function ignores the
upper
,lower
, andtrim
specifiers, which results in a case-sensitive comparison.In addition, the following functions do not support columns with the
upper
,lower
, andtrim
specifiers:Variants of the LIKE function:
If you pass a column with the
upper
,lower
, ortrim
specifier to these functions, a compilation error occurs.- After the change:
The LIKE function respects the
upper
,lower
, andtrim
specifiers, which results in a case-insensitive comparison.As a result, queries that use the LIKE function might return additional rows (see the example below).
Note that the LIKE function does not support combinations with locale specifiers (for example,
en-upper
).In addition, the following functions now support columns with the
upper
,lower
, andtrim
specifiers.Variants of the LIKE function:
LIKE ALL
LIKE ANY
ILIKE ANY
CONTAINS
ENDSWITH
POSITION
REPLACE
SPLIT
SPLIT_PART
STARTSWITH
Example of the effects of the change on the LIKE function¶
As noted above, if a column has the upper
, lower
, or trim
specifier, queries with the LIKE function might return
additional rows. For example, suppose that a table has a column with the lower
specifier. Suppose that the text in the table
differs in case.
CREATE OR REPLACE TABLE collated_like (
col_a VARCHAR,
col_b VARCHAR COLLATE 'lower'
);
INSERT INTO collated_like VALUES ('abc', 'abc'), ('ABC','ABC');
Before the behavior change, each of the following queries that use the LIKE function return one row with the value 'abc'
:
SELECT * FROM collated_like WHERE col_a LIKE '%b%';
SELECT * FROM collated_like WHERE col_a COLLATE 'lower' LIKE '%b%';
SELECT * FROM collated_like WHERE col_b LIKE '%b%';
+-------+-------+
| COL_A | COL_B |
|-------+-------|
| abc | abc |
+-------+-------+
After the behavior change, the query that does not use the lower
specification column with the LIKE function still returns one
row:
SELECT * FROM collated_like WHERE col_a LIKE '%b%';
+-------+-------+
| COL_A | COL_B |
|-------+-------|
| abc | abc |
+-------+-------+
However, the queries that use the COLLATE function to specify lower
and queries that pass a
column with the lower
specification to the LIKE function return two rows:
SELECT * FROM collated_like WHERE col_a COLLATE 'lower' LIKE '%b%';
SELECT * FROM collated_like WHERE col_b LIKE '%b%';
+-------+-------+
| COL_A | COL_B |
|-------+-------|
| abc | abc |
| ABC | ABC |
+-------+-------+
This is roughly equivalent to ILIKE. To determine if you should expect changes to these queries, you can replace LIKE with ILIKE in these queries.
Preserving the behavior before the change¶
If your columns use the upper
, lower
, or trim
specification and you want to preserve the behavior before the change,
you can use the COLLATE function with an empty specification to indicate that the upper
,
lower
, or trim
specification associated with the column should not be used:
SELECT * FROM collated_like WHERE col_b COLLATE '' LIKE '%b%';
Note
If you are using this approach with the LIKE function, make sure that both the subject and pattern do not have a collation specification applied.
Ref: 1535