Automatic De-Duplication Using Scripts

Cortex XSOAR Administrator Guide

Product: Cortex XSOAR
Version: 8
Creation date: 2023-11-02
Last date published: 2024-02-14
Category: Administrator Guide
Abstract

Automate de-duplication of incidents using scripts. Identify and close duplicate incidents in Cortex XSOAR.

You can use the following scripts, either from other scripts or from playbooks, to identify and close duplicate incidents:

  • FindSimilarIncidentsByText

  • FindSimilarIncidents

FindSimilarIncidentsByText
  • Identifies similar incidents based on text similarity. You specify which incident keys, labels, or custom fields to compare.

  • The comparison is based on the TF-IDF (term frequency-inverse document frequency) method.

  • A similarity score between 0 and 1 is calculated for each candidate. A candidate is considered a duplicate when its score exceeds the threshold. The default threshold is 98% (0.98).
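
The scoring described above can be illustrated with a minimal, standard-library sketch. It is not the script's actual implementation: the tokenizer, the smoothed IDF weighting, the use of cosine similarity, and the function names (`tokenize`, `tfidf_vectors`, `similarity_scores`, `is_duplicate`) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_scores(current_text, candidate_texts):
    """Score every candidate text against the current incident text (0 to 1)."""
    vectors = tfidf_vectors([current_text] + list(candidate_texts))
    return [cosine(vectors[0], vec) for vec in vectors[1:]]


def is_duplicate(current_text, candidate_texts, threshold=0.98):
    # A duplicate is reported when at least one candidate exceeds the threshold.
    return any(score > threshold
               for score in similarity_scores(current_text, candidate_texts))
```

In this sketch, a candidate whose text uses exactly the same words as the current incident scores approximately 1, and a candidate that shares no words scores 0.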

!FindSimilarIncidentsByText textFields=name,details maximumNumberOfIncidents=1000 threshold=0.95 timeFrameHours=24 ignoreClosedIncidents=no

This command example checks for duplicate incidents using the following methodology:

  1. Query for duplicate candidates:

    • Incidents created in the previous 24 hours [timeFrameHours=24].

    • Includes closed incidents [ignoreClosedIncidents=no].

    • Maximum number of incidents to check is 1,000 [maximumNumberOfIncidents=1000].

  2. For each candidate, concatenate the name and details incident fields [textFields=name,details] into a single text document.

  3. Compare the text of the current incident with the text of every candidate using the TF-IDF method.

  4. Check if there is at least one similar candidate:

    • A candidate is similar when its TF-IDF score exceeds 95% [threshold=0.95]. If there is at least one such candidate, the current incident is reported as a duplicate.
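
Steps 1 and 2 above (selecting candidates and building one text document per incident) can be sketched as follows. This is an illustration only: incidents are modeled as plain dictionaries with hypothetical `id`, `created`, and `status` keys, the time window is assumed to be measured back from the current incident's creation time, and the function names are invented for this example.

```python
from datetime import datetime, timedelta

CLOSED_STATUS = "closed"  # hypothetical status value used only in this sketch


def select_candidates(incidents, current, time_frame_hours=24,
                      ignore_closed=False, max_incidents=1000):
    """Return incidents created within the window before the current incident."""
    window_start = current["created"] - timedelta(hours=time_frame_hours)
    candidates = []
    for incident in incidents:
        if incident["id"] == current["id"]:
            continue  # never compare the incident with itself
        if not (window_start <= incident["created"] <= current["created"]):
            continue  # outside the time frame
        if ignore_closed and incident["status"] == CLOSED_STATUS:
            continue  # closed incidents are skipped only when requested
        candidates.append(incident)
        if len(candidates) >= max_incidents:
            break  # cap the number of candidates to check
    return candidates


def build_document(incident, text_fields=("name", "details")):
    """Concatenate the chosen fields into one text document."""
    return " ".join(str(incident.get(field, "")) for field in text_fields)
```

With the defaults shown, the sketch mirrors the example command: a 24-hour window, closed incidents included, and at most 1,000 candidates.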

FindSimilarIncidents
  • Rule-based script that identifies similar incidents based on common incident keys, labels, custom fields, or context keys.

  • We recommend using incident keys, for example, "type" to match incidents of the same type.

  • For performance reasons, we recommend avoiding context keys where possible, for example, when the same value is also available in a label key. When context keys are used, each duplicate candidate creates an additional server query.

!FindSimilarIncidents similarIncidentKeys="type,severity" similarLabelsKeys="Email/from,Email/subject:*,Email/text:5" ignoreClosedIncidents="yes" maxNumberOfIncidents="1000" hoursBack="48" timeField="created" maxResults="10"

This command example checks for duplicate incidents using the following methodology:

  1. Query for duplicate candidates:

    • Incidents created in the 48 hours before the original incident [hoursBack="48", timeField="created"].

    • Excludes closed incidents [ignoreClosedIncidents="yes"].

    • Maximum number of incidents to check is 1,000 [maxNumberOfIncidents="1000"].

    • Filters by the same incident type and severity [similarIncidentKeys="type,severity"].

  2. Compare each candidate's labels with those of the original incident [similarLabelsKeys="Email/from,Email/subject:*,Email/text:5"]:

    • Email/from: the label is the same as in the original incident.

    • Email/subject:* : the label contains, or is contained in, the original incident's Email/subject label.

    • Email/text:5 : the label is equal to the original incident's Email/text label, or differs from it by a maximum of 5 words.

  3. If duplicate incidents are found, store the results in the context:

    • A maximum of 10 results are stored [maxResults="10"].
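
The rule-based matching above can be sketched as follows. This is not the script's code. The sketch assumes that labels are available as a simple name-to-value dictionary, that every listed label rule must match, and that "difference in words" means the number of words present in one text but not the other. All function names are hypothetical.

```python
from collections import Counter


def word_difference(a, b):
    """Count words that appear in one text but not the other (assumed metric)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    return sum(((ca - cb) + (cb - ca)).values())


def parse_label_keys(spec):
    """Split "Email/from,Email/subject:*,Email/text:5" into (label, rule) pairs."""
    pairs = []
    for item in spec.split(","):
        name, _, rule = item.strip().partition(":")
        pairs.append((name, rule or None))
    return pairs


def label_matches(original, candidate, rule):
    """rule None = exact match, "*" = containment, number = max word difference."""
    if rule is None:
        return original == candidate
    if rule == "*":
        return original in candidate or candidate in original
    return word_difference(original, candidate) <= int(rule)


def find_similar(original, candidates, incident_keys, label_spec, max_results=10):
    """Return up to max_results candidates that satisfy every key and label rule."""
    rules = parse_label_keys(label_spec)
    matches = []
    for cand in candidates:
        # Incident keys such as type and severity must be identical.
        if any(cand.get(key) != original.get(key) for key in incident_keys):
            continue
        if all(
            name in original["labels"] and name in cand["labels"]
            and label_matches(original["labels"][name], cand["labels"][name], rule)
            for name, rule in rules
        ):
            matches.append(cand)
    return matches[:max_results]
```

For example, calling `find_similar(original, candidates, ["type", "severity"], "Email/from,Email/subject:*,Email/text:5")` keeps only candidates with the same type, severity, and sender, a subject that contains or is contained in the original subject, and body text within five words of the original.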