Categories:

String Functions (Regular Expressions)

REGEXP

Returns true if the subject matches the specified pattern. Both inputs must be text expressions.

Aliases

RLIKE (2nd syntax)

See also String Functions (Regular Expressions).

Syntax

<subject> REGEXP <pattern>

Usage Notes

  • The function implicitly anchors a pattern at both ends (i.e. '' automatically becomes '^$', and 'ABC' automatically becomes '^ABC$'). To match any string starting with ABC, the pattern would be 'ABC.*'.

  • For more usage notes, see the General Usage Notes for regular expression functions.

Collation Details

Arguments with collation specifications are currently not supported.

Examples

The example below shows how to use REGEXP with a simple wildcard expression:

Create the table and load data:

CREATE OR REPLACE TABLE strings (v VARCHAR(50));
INSERT INTO strings (v) VALUES
    ('San Francisco'),
    ('San Jose'),
    ('Santa Clara'),
    ('Sacramento');

Use wildcards to search for a pattern:

SELECT v
    FROM strings
    WHERE v REGEXP 'San* [fF].*'
    ORDER BY v;
+---------------+
| V             |
|---------------|
| San Francisco |
+---------------+

The backslash character \ is the escape character in regular expressions, and specifies special characters or groups of characters. For example, \s is the regular expression for whitespace.

The Snowflake string parser, which parses literal strings, also treats backslash as an escape character. For example, a backslash is used as part of the sequence of characters that specifies a tab character. Thus to create a string that contains a single backslash, you must specify two backslashes. For example, compare the string in the input statement below with the corresponding string in the output:

INSERT INTO strings (v) VALUES
    ('Contains embedded single \\backslash')
    ;
SELECT * 
    FROM strings
    ORDER BY v;
+-------------------------------------+
| V                                   |
|-------------------------------------|
| Contains embedded single \backslash |
| Sacramento                          |
| San Francisco                       |
| San Jose                            |
| Santa Clara                         |
+-------------------------------------+

This example shows how to search for strings that start with “San”, where “San” is a complete word (e.g. not part of “Santa”). \b is the escape sequence for a word boundary.

select v, v regexp 'San\\b.*' AS MATCHES
    from strings
    order by v;
+-------------------------------------+---------+
| V                                   | MATCHES |
|-------------------------------------+---------|
| Contains embedded single \backslash | False   |
| Sacramento                          | False   |
| San Francisco                       | True    |
| San Jose                            | True    |
| Santa Clara                         | False   |
+-------------------------------------+---------+

This example shows how to search for a blank followed by a backslash. Note that the single backslash to search for is represented by four backslashes below; for REGEXP to look for a literal backslash, that backslash must be escaped, so you need two backslashes. The string parser requires that each of those backslashes be escaped, so the expression contains four backslashes to represent the one backslash that the expression is searching for:

select v, v regexp '.*\\s\\\\.*' AS MATCHES
    from strings
    order by v;
+-------------------------------------+---------+
| V                                   | MATCHES |
|-------------------------------------+---------|
| Contains embedded single \backslash | True    |
| Sacramento                          | False   |
| San Francisco                       | False   |
| San Jose                            | False   |
| Santa Clara                         | False   |
+-------------------------------------+---------+

The following example is the same as the preceding example, except that it uses “$$” as a string delimiter to tell the string parser that the string is a literal and that backslashes should not be interpreted as escape sequences. (The backslashes are still interpreted as escape sequences by REGEXP.)

select v, v regexp $$.*\s\\.*$$ AS MATCHES
    from strings
    order by v;
+-------------------------------------+---------+
| V                                   | MATCHES |
|-------------------------------------+---------|
| Contains embedded single \backslash | True    |
| Sacramento                          | False   |
| San Francisco                       | False   |
| San Jose                            | False   |
| Santa Clara                         | False   |
+-------------------------------------+---------+