- Categories:
String & binary functions (Matching/Comparison)
SUBSTR , SUBSTRING¶
Returns the portion of the string or binary value from base_expr
, starting from the character/byte specified by start_expr
,
with optionally limited length.
These functions are synonymous.
Syntax¶
SUBSTR( <base_expr>, <start_expr> [ , <length_expr> ] )
SUBSTRING( <base_expr>, <start_expr> [ , <length_expr> ] )
Arguments¶
base_expr
This must be a
VARCHAR
orBINARY
value.start_expr
The start position should be an expression that evaluates to an integer. It specifies the offset from which the substring starts. The offset is measured in:
The number of UTF-8 characters if the input is
VARCHAR
.The number of bytes if the input is
BINARY
.
The start position is 1-based, not 0-based.
SUBSTR('abc', 1, 1)
returns ‘a’, not ‘b’.length_expr
The length should be an expression that evaluates to an integer. It should specify:
The number of UTF-8 characters to return if the input is
VARCHAR
.The number of bytes to return if the input is
BINARY
.
The length should be greater than or equal to zero. If the length is a negative number, the function returns an empty string.
Returns¶
The data type of the returned value is the same as the data type of the base_expr
(BINARY
or VARCHAR
).
Usage notes¶
If
length_expr
is used, up tolength_expr
characters/bytes are returned, otherwise all the characters until the end of the string or binary value are returned.The values in
start_expr
start from 1:If 0 is specified, it is treated as 1.
If a negative value is specified, the starting position is computed as the
start_expr
characters/bytes from the end of the string or binary value. If the position is outside of the range of a string or binary value, an empty value is returned.
If any of the inputs are NULL, NULL is returned.
Collation details¶
Collation applies to
VARCHAR
inputs. Collation does not apply if the input data type of the first parameter isBINARY
.No impact. Although collation is accepted syntactically, collations have no impact on processing. For example, languages with two-character and three-character letters (e.g. “dzs” in Hungarian, “ch” in Czech) still count those as two or three characters (not one character) for the length argument.
The collation of the result is the same as the collation of the input. This can be useful if the returned value is passed to another function as part of nested function calls.
Examples¶
SELECT SUBSTR('testing 1 2 3', 9, 5) FROM x;
-------------------------------+
substr('testing 1 2 3', 9, 5) |
-------------------------------+
1 2 3 |
-------------------------------+
SELECT '123456', pos, len, SUBSTR('123456', pos, len) FROM o;
----------+--------+-----+----------------------------+
'123456' | pos | len | substr('123456', pos, len) |
----------+--------+-----+----------------------------+
123456 | -1 | 3 | 6 |
123456 | -3 | 3 | 456 |
123456 | -3 | 7 | 456 |
123456 | -5 | 3 | 234 |
123456 | -7 | 3 | |
123456 | 0 | 3 | 123 |
123456 | 0 | 7 | 123456 |
123456 | 1 | 3 | 123 |
123456 | 3 | 3 | 345 |
123456 | 3 | 7 | 3456 |
123456 | 5 | 3 | 56 |
123456 | 5 | 7 | 56 |
123456 | 7 | 3 | |
123456 | [NULL] | 3 | [NULL] |
123456 | [NULL] | 7 | [NULL] |
----------+--------+-----+----------------------------+