This section describes how to use the machine learning model to find incidents that are similar to the one you are investigating. The model does not require any training by the user.
The DBotFindSimilarIncidents script finds similar past incidents based on the similarity of their incident fields. It includes an option to also display indicator similarity. The model aims to detect similarity in text or JSON content, even when the values are not identical. The script returns a summary of the query and all the similar past incidents, which can assist in the investigation of the current incident.
You can run the script as part of an incident or inside the Playground (using the
DBotFindSimilarIncidents script can run
DBotFindSimilarIncidentsByIndicators as a subscript.
Execution of the Script
The execution of the script takes place in 4 steps:
Scope of the research: Fetch only part of the incidents from the instance.
Similarity Metric: Compute incident similarity based on similarity fields that you provide.
Find similar incidents by indicators: Compute incident similarity based on shared indicators (optional).
Output: Display results according to your settings.
Scope of the Research
You can limit the set of incidents on which the model runs. To do so, set the following arguments:
fieldExactMatch: List of incident fields whose values must equal those of the current incident. This helps reduce the query size. For example, you can use the type field if you want to find similarities only among incidents of the same type as your current incident.
fromDate: The start date by which to filter incidents.
toDate: The end date by which to filter incidents.
query: Argument for any additional query.
limit: Maximum number of incidents to fetch and execute the model on.
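The scoping arguments above can be pictured as a pre-filter that runs before any similarity is computed. The sketch below is a minimal illustration in plain Python, not the script's actual implementation: the `scope_incidents` helper, the dictionary layout of an incident, and the `created` key are all assumptions made for the example.

```python
from datetime import datetime


def scope_incidents(incidents, current, field_exact_match=(),
                    from_date=None, to_date=None, limit=None):
    """Hypothetical pre-filter mirroring fieldExactMatch, fromDate, toDate and limit."""
    selected = []
    for inc in incidents:
        # fieldExactMatch: every listed field must equal the current incident's value.
        if any(inc.get(f) != current.get(f) for f in field_exact_match):
            continue
        # fromDate / toDate: keep only incidents created inside the window.
        if from_date is not None and inc["created"] < from_date:
            continue
        if to_date is not None and inc["created"] > to_date:
            continue
        selected.append(inc)
        # limit: stop once enough candidates have been collected.
        if limit is not None and len(selected) >= limit:
            break
    return selected


# Made-up incidents, for illustration only.
current = {"id": "100", "type": "Phishing", "created": datetime(2021, 3, 1)}
incidents = [
    {"id": "1", "type": "Phishing", "created": datetime(2021, 1, 10)},
    {"id": "2", "type": "Malware", "created": datetime(2021, 1, 15)},
    {"id": "3", "type": "Phishing", "created": datetime(2020, 6, 1)},
    {"id": "4", "type": "Phishing", "created": datetime(2021, 2, 20)},
]

candidates = scope_incidents(incidents, current, field_exact_match=["type"],
                             from_date=datetime(2021, 1, 1), limit=10)
print([inc["id"] for inc in candidates])
```

Narrowing the candidate set first keeps the more expensive similarity computation limited to incidents that could plausibly be relevant.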
Similarity Metric
This is the core of the script. The model computes similarity on the incidents selected in the Scope of the Research section. You can choose the fields manually or use all the incident fields from the incidents. Choosing the fields manually can lead to more accurate results.
Manually: Enter the list of fields on which to compute the similarity according to the type of the field:
similarTextField: Value should be textual and comma-separated. (In case of a mapped incident field, there is no need to specify `incident` at the beginning). For example, it can be a command line or a URL.
similarCategoricalField: Value should be categorical. In this case, two values are considered similar only if they match exactly. For example, it can be a hostname or an IP address.
similarJsonField: Value should be a JSON. For example, it can be xdralerts or any custom grid field.
useAllFields: Whether to use a predefined set of fields and custom fields to compute similarity. If "True", it will ignore values in
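To make the difference between the field types concrete, the following sketch compares values the way each type is described above. It is only an approximation under stated assumptions: the three helper functions are hypothetical, and Python's `difflib.SequenceMatcher` is used as a stand-in for the model's text similarity, which the documentation does not specify.

```python
import json
from difflib import SequenceMatcher


def text_similarity(a, b):
    # Stand-in for the model's text similarity: a character-level ratio in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def categorical_similarity(a, b):
    # Categorical fields are similar only on an exact match.
    return 1.0 if a == b else 0.0


def json_similarity(a, b):
    # Assumption: serialize with sorted keys so key order does not matter,
    # then compare the serialized forms as text.
    return text_similarity(json.dumps(a, sort_keys=True),
                           json.dumps(b, sort_keys=True))


cmd_a = "powershell.exe -enc AAAA"
cmd_b = "powershell.exe -enc BBBB"

print(text_similarity(cmd_a, cmd_b))                  # partial similarity
print(categorical_similarity("host-01", "host-02"))   # no similarity
print(json_similarity({"a": 1, "b": 2}, {"b": 2, "a": 1}))
```

The point of the sketch is the contrast: near-identical command lines still score high as text, while near-identical hostnames score zero as categorical values.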
Find Similar Incidents by Indicators
If the includeIndicatorsSimilarity argument is set to "True", the script calls the DBotFindSimilarIncidentsByIndicators subscript. This subscript returns a list of incidents that share indicators with the current incident. Each indicator has a score that depends on its rarity, and very common indicators (those that appear in a large number of incidents) are excluded. Each similar incident has a score between 0 and 1, depending on how many indicators it shares with the current incident and how rare those indicators are.
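One way to picture a rarity-weighted score of this kind is sketched below. This is not the subscript's actual formula, which is not given here: the `indicator_similarity` helper, the logarithmic rarity weight, and the `max_frequency` cutoff are all assumptions chosen to reproduce the described behavior (rare indicators count more, very common ones are dropped, scores fall between 0 and 1).

```python
import math


def indicator_similarity(current, others, max_frequency=0.5):
    """current: set of indicators; others: dict of incident id -> set of indicators."""
    n = len(others)
    counts = {}
    for indicators in others.values():
        for ind in indicators:
            counts[ind] = counts.get(ind, 0) + 1

    # Rarity weight per indicator of the current incident (assumed, IDF-like).
    weights = {}
    for ind in current:
        c = counts.get(ind, 0)
        # Skip indicators no other incident has, and those that are too common.
        if c == 0 or c / n > max_frequency:
            continue
        weights[ind] = math.log(1 + n / c)

    total = sum(weights.values())
    if total == 0:
        return {}

    scores = {}
    for inc_id, indicators in others.items():
        shared = sum(w for ind, w in weights.items() if ind in indicators)
        if shared > 0:
            scores[inc_id] = shared / total  # always in (0, 1]
    return scores


# Made-up data: "8.8.8.8" appears in 3 of 4 incidents, so it is excluded.
others = {
    "1": {"8.8.8.8", "evil.com"},
    "2": {"8.8.8.8"},
    "3": {"8.8.8.8", "bad.exe"},
    "4": {"evil.com", "bad.exe"},
}
scores = indicator_similarity({"8.8.8.8", "evil.com", "bad.exe"}, others)
print(scores)
```

In this toy data, incident 2 shares only the very common indicator and therefore does not appear in the result at all, while incident 4 shares both rare indicators and gets the highest score.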
Output
The output has 3 parts:
Summary of the run.
Current incident (optional; shown only if the showCurrentIncident argument is True).
List of similar incidents.
You can configure the output using the following arguments:
fieldsToDisplay: List of additional incident fields to display, but which are not taken into account when computing similarity.
aggregateIncidentsDifferentDate: Whether to aggregate rows in the output that are identical except for their dates.
showIncidentSimilarityForAllFields: Whether to display the similarity score for each of the incident fields. A similarity score is computed for each field, and these scores are then aggregated into the final similarity score of the incident. If this option is not enabled, the script displays only the final similarity score.
minimumIncidentSimilarity: Retain incidents with a similarity score that's higher than the
maxIncidentsToDisplay: The maximum number of incidents to display. The rest of the incidents won’t be shown in the table but will be part of the content and context.
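The interaction between the similarity threshold and the display limit can be sketched as follows. This is an illustrative sketch only: the `select_for_display` helper and the list-of-dictionaries layout are hypothetical, and the strict "higher than" comparison follows the wording of the minimumIncidentSimilarity description.

```python
def select_for_display(scored, minimum_similarity, max_to_display):
    """Return (rows shown in the table, all rows that passed the threshold)."""
    # Keep only incidents whose score is strictly above the threshold.
    kept = [s for s in scored if s["similarity"] > minimum_similarity]
    # Most similar first; Python's sort is stable, so ties keep their input order.
    kept.sort(key=lambda s: s["similarity"], reverse=True)
    # Only the first max_to_display rows are shown; the rest remain in `kept`.
    return kept[:max_to_display], kept


# Made-up scores, for illustration only.
scored = [
    {"id": "a", "similarity": 1.0},
    {"id": "b", "similarity": 0.82},
    {"id": "c", "similarity": 0.3},
    {"id": "d", "similarity": 1.0},
]

displayed, retained = select_for_display(scored, minimum_similarity=0.5,
                                         max_to_display=2)
print([s["id"] for s in displayed])
print([s["id"] for s in retained])
```

Returning both lists mirrors the described behavior: incidents beyond the display limit are hidden from the table but are still available to later steps.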
Example
In this example, we execute the script on a Splunk incident, using a custom similarity based on some incident labels and custom fields. Note that fields must be referenced by their machine name (lowercase, no spaces).
`!DBotFindSimilarIncidents similarTextField="incident.labels.host,incident.labels.threat_group,incident.labels.threat_source,details,srcs,incident.labels.threat_match_value,dsts" fieldsToDisplay="cmdline,tactic,technique,hostnames,ipaddress,parentcmdline,filepaths,severity" fieldExactMatch="type" useAllFields="False" showIncidentSimilarityForAllFields="True" showCurrentIncident="True"`
In the example, we are looking for similar incidents. The search scope is limited to incidents with the same incident type, and the similarity is based on the fields: incident.labels.host,incident.labels.threat_group,incident.labels.threat_source,details,srcs,incident.labels.threat_match_value,dsts. We will display the following fields in the results: cmdline,tactic,technique,hostnames,ipaddress,parentcmdline,filepaths,severity.
The first entry is the summary. Here we see that the script fetched 1051 incidents that match the criteria we defined using the fieldExactMatch argument. These are the candidates used to calculate the similarity.
Within those incidents, the script found 3 incidents that have an overall similarity above the threshold of 0.5. The similarity score is computed from the
srcs fields. The other fields provided for computing similarity in similarTextField could not be found in the incidents (as the message indicates).
Because showCurrentIncident="True", the script shows the current incident under investigation (with the fields that were defined).
The last entry to be returned is the similar incidents found. The first 2 incidents have a similarity score of 1 (exact matching for the
srcs field). The last one has a similarity of 0.82 due to the small difference in the details field value. We can see that the similarity for this field is 0.63.
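The figures in this example are consistent with the final score being the unweighted mean of the per-field scores. That is only an inference from the numbers shown here, not a documented formula, so treat the `aggregate` helper below as a hypothetical sketch. It also assumes that the srcs similarity of the last incident is 1, since only the details field is reported as different.

```python
def aggregate(field_scores):
    # Assumed aggregation: plain average of the per-field similarity scores.
    return sum(field_scores.values()) / len(field_scores)


# Per-field scores of the last incident, as read from the example output.
last_incident = {"srcs": 1.0, "details": 0.63}

final_score = aggregate(last_incident)
print(final_score)  # approximately 0.815
```

A mean of roughly 0.815 lines up with the displayed 0.82 once you allow for the per-field value 0.63 itself being rounded for display.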