Train a Classifier on Other Languages

Cortex XSOAR Administrator Guide

Product: Cortex XSOAR
Version: 6.5
Creation date: 2022-09-28
Last date published: 2024-07-16
Category: Administrator Guide, End of Life > EoL
Abstract

Train a classifier for machine learning on non-English text. Adjust the language and tokenization method used to train a classifier in Cortex XSOAR.

To train a classifier on languages other than those referred to in Train a Classifier on Languages with Adjusted Tokenization, you need to configure the language and tokenization method. Tokenization is the method by which the classifier breaks up sentences and words to analyze threats appropriately. When the language for the classifier is set to Other, you can set the tokenization method used for training to one of the following options:

  • Tokenization - (Default) automatically separates sentences into words

  • Word - separates the text based on spacing

  • Letter - separates the text based on characters and symbols
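To make the difference between the Word and Letter options concrete, the following is a minimal Python sketch of the two splitting strategies as they are described above. It is an illustration only and not the code that DBotPreProcessTextData runs; the function names tokenize_by_word and tokenize_by_letter are hypothetical. The default Tokenization option is not sketched because its behavior is not specified here beyond separating sentences into words.

```python
from typing import List


def tokenize_by_word(text: str) -> List[str]:
    """Hypothetical sketch of the Word option: split on spacing only."""
    # str.split() with no arguments splits on any run of whitespace.
    return text.split()


def tokenize_by_letter(text: str) -> List[str]:
    """Hypothetical sketch of the Letter option: one token per character or symbol."""
    # Whitespace is dropped; every other character becomes its own token.
    return [ch for ch in text if not ch.isspace()]


sample = "Bonjour, le monde!"
print(tokenize_by_word(sample))
print(tokenize_by_letter(sample))
```

Splitting on spacing keeps punctuation attached to the neighboring word, so it suits languages that separate words with spaces. Splitting by character may be the more useful choice for text that is written without spaces between words.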

Follow the steps below to adjust the language and tokenization method used to train a classifier for other languages.

  1. Go to Automation.

  2. Search for DBotPreProcessTextData.

    1. Copy the automation by selecting Duplicate Automation.

    2. (Optional) Change the name of the duplicated script to make it distinguishable.

    3. From the Argument section, expand the tokenizationMethod field, and change the Initial value to the desired tokenization method. For example, byWord.

    4. Expand the language field and change the Initial value to Other.

    5. Click Save.

  3. Search for DBotPredictPhishingWords.

    1. Copy the automation by selecting Duplicate Automation.

    2. (Optional) Change the name of the duplicated script to make it distinguishable.

    3. From the Argument section, expand the tokenizationMethod field, and change the Initial value to the desired tokenization method. For example, byWord.

    4. Expand the language field and change the Initial value to Other.

    5. Click Save.

  4. Navigate to Playbooks.

  5. Search for the DBot Create Phishing Classifier V2 playbook to update.

    1. Copy the playbook by selecting Duplicate Playbook.

    2. (Optional) Change the name of the duplicated playbook to make it distinguishable.

    3. Select the Pre-process file task.

    4. From the dropdown menu, replace the automation with the duplicated version of DBotPreProcessTextData created in Step 2.

    5. Click OK and Save Version.
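
As a quick sanity check after completing the steps, the Initial values set on the two duplicated automations can be written down and compared. The Python sketch below is a hypothetical checklist and not part of Cortex XSOAR: the copy names and the check_settings helper are invented for illustration, and only the argument names (tokenizationMethod, language) and the example values (byWord, Other) come from the steps above. It also assumes that both copies should use the same tokenization method, which the procedure implies but does not state outright.

```python
# Hypothetical record of the Initial values entered in Steps 2 and 3.
# The "_copy" names are made up; use whatever names you gave the duplicates.
duplicated_automations = {
    "DBotPreProcessTextData_copy": {
        "tokenizationMethod": "byWord",
        "language": "Other",
    },
    "DBotPredictPhishingWords_copy": {
        "tokenizationMethod": "byWord",
        "language": "Other",
    },
}


def check_settings(automations):
    """Return a list of problems found in the recorded argument values."""
    problems = []
    methods = set()
    for name, args in automations.items():
        # Each duplicated automation should have language set to Other.
        if args.get("language") != "Other":
            problems.append(f"{name}: language should be 'Other'")
        methods.add(args.get("tokenizationMethod"))
    # Assumption: both copies are meant to tokenize text the same way.
    if len(methods) > 1:
        problems.append("tokenizationMethod differs between the duplicated automations")
    return problems


print(check_settings(duplicated_automations) or "settings look consistent")
```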