Categories:

String Functions (Regular Expressions)

# REGEXP_SUBSTR_ALL¶

Returns an ARRAY that contains all substrings that match a regular expression within a string. If no match is found, the function returns an empty ARRAY.

Aliases:

REGEXP_EXTRACT_ALL

## Syntax¶

REGEXP_SUBSTR_ALL( <subject> , <pattern> [ , <position> [ , <occurrence> [ , <regex_parameters> [ , <group_num> ] ] ] ] )


## Arguments¶

Required:

subject

The string to search for matches.

pattern

Pattern to match.

For guidelines on specifying patterns, see String Functions (Regular Expressions).

Optional:

position

Number of characters from the beginning of the string where the function starts searching for matches.

Default: 1 (the search for a match starts at the first character on the left)

occurrence

Specifies the first occurrence of the pattern from which to start returning matches. The function skips the first occurrence - 1 matches.

For example, if there are 5 matches and you specify 3 for the occurrence argument, the function ignores the first two matches and returns the third, fourth, and fifth matches.

Default: 1

regex_parameters

String of one or more characters that specifies the regular expression parameters used to search for matches. The supported values are:

• c: case-sensitive.

• i: case-insensitive.

• m: multi-line mode.

• e: extract sub-matches.

• s: the ‘.’ wildcard also matches newline.

For more details, see the regular expression parameters documentation.

Default: c

Note

By default, REGEXP_SUBSTR_ALL returns the entire matching part of the subject. However, if the e (for “extract”) parameter is specified, REGEXP_SUBSTR_ALL returns the part of the subject that matches the first group in the pattern. If e is specified but a group_num is not also specified, then the group_num defaults to 1 (the first group). If there is no sub-expression in the pattern, REGEXP_SUBSTR_ALL behaves as if e was not set. For examples that use e, see Examples in this topic.

group_num

The group_num parameter specifies which group to extract. Groups are specified by using parentheses in the regular expression.

If a group_num is specified, Snowflake allows extraction even if the 'e' option was not also specified. The 'e' is implied.

Snowflake supports up to 1024 groups.

For examples that use group_num, see the Examples in this topic.

## Returns¶

The function returns a value of type ARRAY. The ARRAY contains an element for each matching substring.

If you specify group_num and the pattern does not specify a grouping with that number, the function returns NULL. For example, if the pattern specifies only one group (e.g. a(b)c) and you use 2 as group_num, the function returns NULL.

## Usage Notes¶

For additional information on using regular expressions, see String Functions (Regular Expressions).

## Collation Details¶

Arguments with collation specifications are currently not supported.

## Examples¶

The pattern in the following example matches a lowercase “a” followed by a digit. The example returns an ARRAY that contains all of the matches:

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]') as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a1", |
|   "a2", |
|   "a3", |
|   "a4", |
|   "a6"  |
| ]       |
+---------+


The following example starts finding matches from the second character in the string (2):

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]', 2) as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a2", |
|   "a3", |
|   "a4", |
|   "a6"  |
| ]       |
+---------+


The following example starts returning matches from the third occurrence of the pattern in the string (3):

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]', 1, 3) as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a3", |
|   "a4", |
|   "a6"  |
| ]       |
+---------+


The following example performs a case-insensitive match (i):

select regexp_substr_all('a1_a2a3_a4A5a6', 'a[[:digit:]]', 1, 1, 'i') as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a1", |
|   "a2", |
|   "a3", |
|   "a4", |
|   "A5", |
|   "a6"  |
| ]       |
+---------+


The following example performs a case-insensitive match and returns the part of the string that matches the first group (ie):

select regexp_substr_all('a1_a2a3_a4A5a6', '(a)([[:digit:]])', 1, 1, 'ie') as matches;
+---------+
| MATCHES |
|---------|
| [       |
|   "a",  |
|   "a",  |
|   "a",  |
|   "a",  |
|   "A",  |
|   "a"   |
| ]       |
+---------+


The following example demonstrates that the function returns an empty ARRAY when no matches are found:

select regexp_substr_all('a1_a2a3_a4A5a6', 'b') as matches;
+---------+
| MATCHES |
|---------|
| []      |
+---------+