> This page location: Postgres guides > Functions > String functions > regexp_match
> Full Neon documentation index: https://neon.com/docs/llms.txt

# Postgres regexp_match() function

Extract substrings matching a regular expression pattern

The Postgres `regexp_match()` function is used to extract substrings that match a regular expression pattern from a given string. It returns an array of matching substrings, including capture groups if specified in the pattern.

Use it for complex string parsing tasks, such as extracting structured information from semi-structured text data. For example, it can be used to parse log files, extract specific components from URLs, or analyze text data for specific patterns.

## Function signature

The `regexp_match()` function has the following form:

```sql
regexp_match(string text, pattern text [, flags text]) -> text[]
```

- `string`: The input string to search for matches.
- `pattern`: A POSIX regular expression pattern to match against the string.
- `flags` (optional): A string of one or more single-letter flags that modify how the regular expression is interpreted.

The function returns an array of text values, where each element corresponds to a substring within the first match of the pattern in the input string. If there are no matches, the function returns NULL. If there are no capture groups in the pattern, the array contains a single element with the full match.

## Example usage

Consider a table `log_entries` with a `log_text` column containing log messages. We can use `regexp_match()` to extract specific information from these logs.

```sql
WITH log_entries AS (
  SELECT '[2024-03-04 10:15:30] INFO: User john_doe logged in from 192.168.1.100' AS log_text
  UNION ALL
  SELECT '[2024-03-04 10:20:45] ERROR: Failed login attempt for user jane_smith from 10.0.0.50' AS log_text
  UNION ALL
  SELECT '[2024-03-04 10:25:55] INFO: User admin logged out' AS log_text
)
SELECT
  regexp_match(log_text, '\[(.*?)\] (\w+): (.*)$') AS parsed_log
FROM log_entries;
```

This query extracts the timestamp, log level, and message from each log entry. The regular expression pattern `\[(.*?)\] (\w+): (.*)$` captures three groups:

1. The timestamp between square brackets
2. The log level (INFO, ERROR, etc.), which is alphabetical and terminated with a colon
3. The rest of the message

```text
                                       parsed_log
-----------------------------------------------------------------------------------------
 {"2024-03-04 10:15:30",INFO,"User john_doe logged in from 192.168.1.100"}
 {"2024-03-04 10:20:45",ERROR,"Failed login attempt for user jane_smith from 10.0.0.50"}
 {"2024-03-04 10:25:55",INFO,"User admin logged out"}
(3 rows)
```

## Advanced examples

### Use `regexp_match()` with regex flags

The `regexp_match()` function accepts optional flags to modify how the regular expression is interpreted. Here's an example using the 'i' flag for case-insensitive matching:

```sql
WITH user_agents AS (
  SELECT 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36' AS user_agent
  UNION ALL
  SELECT 'Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Mobile/15E148 Safari/604.1' AS user_agent
  UNION ALL
  SELECT 'CHROME/91.0.4472.124' AS user_agent
)
SELECT
  regexp_match(user_agent, '(chrome|safari|firefox|msie|opera)\/[\d\.]+', 'i') AS browser
FROM user_agents;
```

This query extracts the browser name and version from user agent strings, using case-insensitive matching.

```text
 browser
----------
 {Chrome}
 {Safari}
 {CHROME}
(3 rows)
```

### Use `regexp_match()` in a WHERE clause

You can use `regexp_match()` in a WHERE clause to filter rows based on a regex pattern:

```sql
WITH emails AS (
  SELECT 'john.doe@example.com' AS email
  UNION ALL
  SELECT 'jane.smith@company.co.uk' AS email
  UNION ALL
  SELECT 'support@mydomain.io' AS email
)
SELECT *
FROM emails
WHERE regexp_match(email, '^[^@]+@[^@]+\.(com|org|io)$') IS NOT NULL;
```

This query selects all rows from the `emails` table where the email address ends with `.com`, ``.org`, or `.io`.

```text
        email
----------------------
 john.doe@example.com
 support@mydomain.io
(2 rows)
```

## Additional considerations

### Performance implications

Using `regexp_match()` can be computationally expensive, especially on large datasets or with complex patterns. For better performance:

1. Use simpler patterns when possible.
2. Consider using `LIKE` or `SIMILAR TO` for simple pattern matching.
3. If you frequently filter based on regex patterns, consider creating a functional index using the `regexp_match()` expression.

### NULL handling

`regexp_match()` returns NULL if there's no match or if the input string is NULL. This behavior can be useful in `WHERE` clauses but may require careful handling in `SELECT` lists.

### Alternative functions

- `regexp_matches()`: Returns a set of all matches, useful for extracting multiple occurrences of the pattern in the input string.
- `regexp_replace()`: Replaces substrings matching a regex pattern within a specified string.
- `regexp_split_to_array()`: Splits a string using a regex pattern as the delimiter and returns the result as an array.
- `substring()`: Extracts substrings based on a regex pattern similar to `regexp_match()`, but only returns the first captured group of the match.

## Resources

- [PostgreSQL documentation: Pattern Matching](https://www.postgresql.org/docs/current/functions-matching.html)
- [PostgreSQL documentation: Regular Expression Details](https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP)
- [Regular Expression Tester](https://regex101.com/): A useful tool for testing and debugging regular expressions

---

## Related docs (String functions)

- [concat](https://neon.com/docs/functions/concat)
- [substring](https://neon.com/docs/functions/substring)
- [lower](https://neon.com/docs/functions/lower)
- [regexp_replace](https://neon.com/docs/functions/regexp_replace)
- [trim](https://neon.com/docs/functions/trim)
