Categories:

String Functions (Regular Expressions)

REGEXP_SUBSTR_ALL

Returns an ARRAY that contains all substrings that match a regular expression within a string. If no match is found, the function returns an empty ARRAY.

Aliases:

REGEXP_EXTRACT_ALL

See also: String Functions (Regular Expressions)

Syntax

REGEXP_SUBSTR_ALL( <subject> , <pattern> [ , <position> [ , <occurrence> [ , <regex_parameters> [ , <group_num> ] ] ] ] )
Copy

Arguments

Required:

subject

The string to search for matches.

pattern

Pattern to match.

For guidelines on specifying patterns, see String Functions (Regular Expressions).

Optional:

position

Number of characters from the beginning of the string where the function starts searching for matches.

Default: 1 (the search for a match starts at the first character on the left)

occurrence

Specifies the first occurrence of the pattern from which to start returning matches. The function skips the first occurrence - 1 matches.

For example, if there are 5 matches and you specify 3 for the occurrence argument, the function ignores the first two matches and returns the third, fourth, and fifth matches.

Default: 1

regex_parameters

String of one or more characters that specifies the regular expression parameters used to search for matches. The supported values are:

  • c: case-sensitive.

  • i: case-insensitive.

  • m: multi-line mode.

  • e: extract sub-matches.

  • s: the ‘.’ wildcard also matches newline.

For more details, see the regular expression parameters documentation.

Default: c

Note

By default, REGEXP_SUBSTR_ALL returns the entire matching part of the subject. However, if the e (for “extract”) parameter is specified, REGEXP_SUBSTR_ALL returns the part of the subject that matches the first group in the pattern. If e is specified but a group_num is not also specified, then the group_num defaults to 1 (the first group). If there is no sub-expression in the pattern, REGEXP_SUBSTR_ALL behaves as if e was not set. For examples that use e, see Examples in this topic.

group_num

The group_num parameter specifies which group to extract. Groups are specified by using parentheses in the regular expression.

If a group_num is specified, Snowflake allows extraction even if the 'e' option was not also specified. The 'e' is implied.

Snowflake supports up to 1024 groups.

For examples that use group_num, see the Examples in this topic.

Returns

The function returns a value of type ARRAY. The ARRAY contains an element for each matching substring.

If you specify group_num and the pattern does not specify a grouping with that number, the function returns NULL. For example, if the pattern specifies only one group (e.g. a(b)c) and you use 2 as group_num, the function returns NULL.

Usage Notes

For additional information on using regular expressions, see String Functions (Regular Expressions).

Collation Details

Arguments with collation specifications are currently not supported.

Examples

The pattern in the following example matches a lowercase “a” followed by a digit. The example returns an ARRAY that contains all of the matches:

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]') as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a1", |
|   "a2", |
|   "a3", |
|   "a4", |
|   "a6"  |
| ]       |
+---------+
Copy

The following example starts finding matches from the second character in the string (2):

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]', 2) as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a2", |
|   "a3", |
|   "a4", |
|   "a6"  |
| ]       |
+---------+
Copy

The following example starts returning matches from the third occurrence of the pattern in the string (3):

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]', 1, 3) as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a3", |
|   "a4", |
|   "a6"  |
| ]       |
+---------+
Copy

The following example performs a case-insensitive match (i):

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]', 1, 1, 'i') as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a1", |
|   "a2", |
|   "a3", |
|   "a4", |
|   "A5", |
|   "a6"  |
| ]       |
+---------+
Copy

The following example performs a case-insensitive match and returns the part of the string that matches the first group (ie):

select regexp_substr_all('a1_a2a3_a4A5a6', '(a)([[:digit:]])', 1, 1, 'ie') as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a",  |
|   "a",  |
|   "a",  |
|   "a",  |
|   "A",  |
|   "a"   |
| ]       |
+---------+
Copy

The following example demonstrates that the function returns an empty ARRAY when no matches are found:

select regexp_substr_all('a1_a2a3_a4A5a6', 'b') as matches;
+---------+
| MATCHES |
|---------|
| []      |
+---------+
Copy