regexcapture

Cortex XDR XQL Language Reference

Product
Cortex XDR
Creation date
2024-02-26
Last date published
2024-04-21
Category
Reference Guide
Abstract

Learn more about the Cortex Query Language regexcapture() function, which is used in Parsing Rules to extract data from a string field using regular expression named groups.

Important

The regexcapture() function is only supported in the XQL syntax for Parsing Rules.

Syntax

regexcapture(<field>, "<pattern>")

Description

In Parsing Rules, the regexcapture() function extracts data from a string field using regular expression named groups and returns a JSON object containing the captured groups. This function can be used in any section of a Parsing Rule. The regexcapture() function is useful when the regex pattern is not identical throughout the log; the regextract function, by contrast, requires the pattern to be identical throughout the log.

XQL uses RE2 for its regular expression implementation. To use case-insensitive mode in your query, add the (?i) syntax only once, at the beginning of the inline regular expression.
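
As a rough illustration of the case-insensitive flag, the following Python sketch places (?i) once at the start of a pattern with named groups. Python's re module is not RE2, so this only approximates XQL behavior. The pattern, the sample strings, and the capture helper are hypothetical, and returning an empty dict when nothing matches is an assumption of this sketch, not documented regexcapture() behavior.

```python
import re

# Hypothetical pattern for illustration: (?i) appears once, at the very start.
PATTERN = r"(?i)^(?P<method>get|post) (?P<path>\S+)"


def capture(pattern: str, text: str) -> dict:
    """Return the named groups as a dict, or an empty dict if there is no match.

    The empty-dict fallback is an assumption made for this sketch.
    """
    match = re.search(pattern, text)
    return match.groupdict() if match else {}


# Both spellings match because the whole pattern is case-insensitive.
print(capture(PATTERN, "GET /index.html"))
print(capture(PATTERN, "get /index.html"))

# Without (?i), the lowercase pattern does not match the uppercase input.
print(capture(r"^(?P<method>get|post) (?P<path>\S+)", "GET /index.html"))
```

The captured text keeps its original casing; the flag only affects how the pattern is matched.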

Example

The following Parsing Rule creates a dataset called my_regexcapture_test, where the vendor and product that the Parsing Rule applies to are regexcapture_vendor and regexcapture_product. The output includes a new field called regexcaptureResult, which extracts data from the _raw_log field using the defined regular expression named groups and returns the captured groups.

Parsing Rule:

[INGEST:vendor="regexcapture_vendor", product="regexcapture_product", target_dataset="my_regexcapture_test"]
alter regexcaptureResult = regexcapture(_raw_log,"^(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - (?P<user>\w+) \[(?P<timestamp>.+)\] (?P<request>.+) (?P<status>\d{3}) (?P<bytes>\d+)");

Log:

192.168.1.1 - john [10/Mar/2024:12:34:56 +0000] GET /index.html HTTP/1.1 200 1234

XQL Query:

The following query returns the regexcaptureResult field from the my_regexcapture_test dataset.

dataset = my_regexcapture_test 
| fields regexcaptureResult

regexcaptureResult field output:

{
  "ip": "192.168.1.1",
  "user": "john",
  "timestamp": "10/Mar/2024:12:34:56 +0000",
  "request": "GET /index.html HTTP/1.1",
  "status": "200",
  "bytes": "1234"
}
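
To check a pattern against a sample log before adding it to a Parsing Rule, the example above can be approximated outside XQL. The following is a minimal Python sketch, not the XQL implementation: it uses Python's re module instead of RE2 (the two agree for this pattern), and the regexcapture_sketch helper is a hypothetical name introduced here. The pattern and log line are copied from the example.

```python
import json
import re

# Pattern and log line copied from the Parsing Rule example above.
PATTERN = (
    r"^(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - (?P<user>\w+) "
    r"\[(?P<timestamp>.+)\] (?P<request>.+) (?P<status>\d{3}) (?P<bytes>\d+)"
)
RAW_LOG = (
    "192.168.1.1 - john [10/Mar/2024:12:34:56 +0000] "
    "GET /index.html HTTP/1.1 200 1234"
)


def regexcapture_sketch(raw_log: str, pattern: str) -> dict:
    """Hypothetical stand-in for regexcapture(): map each named group to its match.

    Returning an empty dict on no match is an assumption of this sketch.
    """
    match = re.search(pattern, raw_log)
    return match.groupdict() if match else {}


result = regexcapture_sketch(RAW_LOG, PATTERN)
print(json.dumps(result, indent=2))
```

Every captured value is a string, including status and bytes, which is consistent with the quoted values in the regexcaptureResult output shown above.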