approx_top

Product: Cortex XDR
Creation date: 2024-03-06
Last date published: 2024-10-01
Category: Administrator Guide
Abstract

Learn more about the Cortex Query Language approx_top approximate aggregate comp function.

Syntax
comp approx_top as count
comp approx_top(<string field>, <number>) [as <alias>] [by <field1>[,<field2>...]][addrawdata = true|false [as <target field>]] 
comp approx_top as sum
comp approx_top(<string field>, <number>, <weight string field>) [as <alias>] [by <field1>[,<field2>...]][addrawdata = true|false [as <target field>]]
Description

The approx_top approximate aggregate is a comp function, used in combination with a comp stage, that returns the approximate top elements of a field. Depending on the number of parameters, the elements are ranked either by count or by sum. The function returns a single value for the given field over a group of rows, for all records that contain matching values for the fields identified in the by clause. When a third parameter is specified, it references a field that contains a numeric value (weight) that is used to calculate a sum.

The return value is an array of up to <number> JSON strings. Each string represents an object (struct) containing two keys and their corresponding values. The keys depend on whether a third parameter is supplied.

When approx_top is used to count (the third parameter is omitted), each struct has the keys "value" and "count", where "value" is a unique field value and "count" is its number of occurrences. When the third parameter is specified, it must be the name of a field that contains a numeric value; this value is summed for each unique value in the first specified field. In this case, each struct has the keys "value" and "sum".
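
The two result shapes described above can be modeled offline with a short Python sketch. This is not XQL and not how Cortex XDR computes the result: it uses exact counting as a stand-in for the approximate aggregate, and the function names top_by_count and top_by_sum, as well as the sample values, are hypothetical.

```python
import json
from collections import Counter, defaultdict


def top_by_count(values, number):
    """Model of the two-parameter form: rank unique values by occurrences."""
    counts = Counter(values)
    # Each element of the returned array is a JSON string with "value" and "count".
    return [json.dumps({"value": v, "count": c})
            for v, c in counts.most_common(number)]


def top_by_sum(rows, number):
    """Model of the three-parameter form: rank unique values by summed weight."""
    sums = defaultdict(int)
    for value, weight in rows:
        sums[value] += weight
    ranked = sorted(sums.items(), key=lambda kv: kv[1], reverse=True)[:number]
    # Each element of the returned array is a JSON string with "value" and "sum".
    return [json.dumps({"value": v, "sum": s}) for v, s in ranked]


# Hypothetical sample data, for illustration only.
agent_ids = ["a1", "a2", "a1", "a3", "a1", "a2"]
durations = [("a1", 5), ("a2", 40), ("a1", 10), ("a3", 1)]

print(top_by_count(agent_ids, 2))
print(top_by_sum(durations, 2))
```

Note that the array can be shorter than the requested number when the field has fewer unique values, which is why the return value is described as "up to" <number> strings.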

In addition, you can configure whether the raw data events are displayed in the final comp results by setting addrawdata to either true or false (default). When you include raw data events in your query, the query runs for up to 50 fields that you define and displays up to 100 events.

Use this approximate aggregate function when approximate results are acceptable. Compared with the exact results produced by regular aggregate functions, approximate aggregates are more scalable in terms of memory usage and time.
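
To make the memory trade-off concrete, the Python sketch below implements the Misra-Gries frequent-items summary, a well-known textbook technique for approximate top-element counting. It is an illustration only: this document does not state which algorithm approx_top uses, and the function name misra_gries, the parameter max_counters, and the sample stream are hypothetical.

```python
def misra_gries(stream, max_counters):
    """Approximate frequent-item counts using at most max_counters counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < max_counters:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical stream: "a" appears 6 times, "b" 3 times, "c" once.
stream = ["a"] * 6 + ["b"] * 3 + ["c"]
summary = misra_gries(stream, max_counters=2)
print(summary)  # {'a': 5, 'b': 2}
```

Memory stays bounded by max_counters regardless of how many unique values the stream contains. The price is that the reported counts can undercount the true ones ("a" is reported as 5 rather than 6 here).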

Examples
comp approx_top as count

Returns an approximate count for the 10 agent IDs that appear most frequently in the agent_id field. The return value is an array of up to 10 JSON strings, each with a "value" and a "count".

dataset = xdr_data
| fields agent_id
| comp approx_top(agent_id, 10)
comp approx_top as sum

Returns the top 10 agent IDs in the agent_id field, ranked by the approximate sum of their action_session_duration. The return value is an array of up to 10 JSON strings, each with a "value" (the agent_id) and a "sum".

dataset = xdr_data
| fields agent_id, action_session_duration
| comp approx_top(agent_id, 10, action_session_duration)
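
Because each array element is a JSON string rather than a parsed object, a consumer has to decode it before use. The following Python sketch shows one way to do that, assuming the result array has already been exported into Python as a list of strings (how that export happens is outside the scope of this page). The function name parse_approx_top and the sample values are hypothetical.

```python
import json


def parse_approx_top(result):
    """Decode an approx_top result array into (value, metric) tuples."""
    rows = []
    for item in result:
        struct = json.loads(item)
        # "count" is present for the two-parameter form,
        # "sum" for the three-parameter form.
        metric_key = "sum" if "sum" in struct else "count"
        rows.append((struct["value"], struct[metric_key]))
    return rows


# Hypothetical result values, shaped as described in this section.
count_result = ['{"value": "agent-01", "count": 120}',
                '{"value": "agent-02", "count": 75}']
sum_result = ['{"value": "agent-02", "sum": 3600}']

print(parse_approx_top(count_result))
print(parse_approx_top(sum_result))
```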