Learn more about the Cortex Query Language approx_top approximate aggregate comp function.
Syntax
comp approx_top as count
comp approx_top(<string field>, <number>) [as <alias>] [by <field1>[,<field2>...]][addrawdata = true|false [as <target field>]]
comp approx_top as sum
comp approx_top(<string field>, <number>, <weight string field>) [as <alias>] [by <field1>[,<field2>...]][addrawdata = true|false [as <target field>]]
Description
The approx_top approximate aggregate is a comp function that, depending on the number of parameters, returns either an approximate count or an approximate sum of the top elements. It returns a single value for the given field over a group of rows, for all records that contain matching values for the fields identified in the by clause. This function is used in combination with a comp stage. When a third parameter is specified, it references a field that contains a numeric value (weight) that is used to calculate a sum. The return value is an array of up to <number> JSON strings. Each string represents an object (struct) containing two keys and their corresponding values. The keys depend on whether a third parameter has been supplied.
When approx_top is used to count and the third parameter is omitted, each struct has the keys "value" and "count", where "value" is a unique field value and "count" is the number of occurrences of that value. When the third parameter is specified in approx_top, it must be the name of a field that contains a numeric value, which is used to calculate the final sum for each unique value in the first specified field. In this case, each struct has the keys "value" and "sum".
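The shape of the return value described above can be sketched in Python. This is only an illustration under stated assumptions, not how Cortex computes the result: it uses exact counting rather than an approximation, it assumes that the sum form ranks values by their summed weight, and the helper names top_as_count and top_as_sum as well as the sample rows are hypothetical.

```python
import json
from collections import Counter, defaultdict

def top_as_count(rows, field, number):
    """Exact stand-in for the count form: up to `number` JSON strings with "value" and "count"."""
    counts = Counter(row[field] for row in rows)
    return [json.dumps({"value": v, "count": c}) for v, c in counts.most_common(number)]

def top_as_sum(rows, field, number, weight_field):
    """Exact stand-in for the sum form: up to `number` JSON strings with "value" and "sum"."""
    sums = defaultdict(int)
    for row in rows:
        sums[row[field]] += row[weight_field]
    # Assumption: values are ranked by their summed weight, largest first.
    ranked = sorted(sums.items(), key=lambda kv: kv[1], reverse=True)
    return [json.dumps({"value": v, "sum": s}) for v, s in ranked[:number]]

# Made-up sample rows; the field names are borrowed from the examples in this topic.
rows = [
    {"agent_id": "a1", "action_session_duration": 5},
    {"agent_id": "a1", "action_session_duration": 7},
    {"agent_id": "a2", "action_session_duration": 40},
    {"agent_id": "a3", "action_session_duration": 1},
]

count_result = top_as_count(rows, "agent_id", 2)
sum_result = top_as_sum(rows, "agent_id", 2, "action_session_duration")

# Each element is a JSON string, so it has to be parsed before its keys can be read.
for item in count_result:
    print(json.loads(item))
for item in sum_result:
    print(json.loads(item))
```

Note that the most frequent value (a1 here) is not necessarily the one with the largest summed weight (a2 here), which is why the two forms can return different top elements for the same field.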
In addition, you can configure whether raw data events are displayed in the final comp results by setting addrawdata to either true or false (default). When you include raw data events in your query, the query runs for up to 50 fields that you define and displays up to 100 events.
Use this approximate aggregate function to produce approximate results instead of the exact results produced by regular aggregate functions. Approximate aggregate functions are more scalable in terms of memory usage and time.
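To give a sense of why an approximate top-elements calculation can use less memory than an exact one, the following Python sketch implements the Misra-Gries frequent-items summary. This topic does not state which algorithm approx_top uses, so this is a swapped-in, well-known technique for illustration only; the function name misra_gries, the parameter k, and the sample stream are assumptions.

```python
def misra_gries(stream, k):
    """Keep at most k - 1 counters, however many distinct items the stream holds.

    Each reported count is never above the true count and is at most
    len(stream) / k below it, so any item that occurs more than
    len(stream) / k times is guaranteed to remain in the summary.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the ones that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# Made-up stream: two heavy items followed by 20 items that each occur once.
stream = ["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)]
summary = misra_gries(stream, k=5)
print(summary)
```

An exact count of this stream would track 22 distinct values, while the summary never holds more than 4 counters. The price is that the reported counts are lower bounds rather than exact figures.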
Examples
comp approx_top as count
Returns an approximate count of the top 10 agent IDs in the agent_id field that appear the most frequently. The return value is an array containing up to 10 JSON strings, each with a "value" and a "count".
dataset = xdr_data | fields agent_id | comp approx_top(agent_id, 10)
comp approx_top as sum
Returns an approximate sum of the top 10 agent IDs in the agent_id field by their action_session_duration. The return value is an array containing up to 10 JSON strings, each with a "value" and a "sum" for an agent_id.
dataset = xdr_data | fields agent_id, action_session_duration | comp approx_top(agent_id, 10, action_session_duration)