Broker VM High Availability Cluster - Administrator Guide - Cortex XDR - Cortex

Cortex XDR Pro Administrator Guide

Product: Cortex XDR
License: Pro
Creation date: 2024-02-26
Last date published: 2024-04-21

Cluster Architecture

The Clusters tab on the Broker VMs page shows your cluster configurations, including the associated nodes, node statuses, configured applets, and applet statuses. You can add as many clusters as you want in a tenant, and each Cortex XDR cluster can include as many nodes as you need. Cluster operation is fully managed from the tenant: you don't need to install additional components, and cluster nodes don't need to communicate with one another on the network.

In each cluster, one Broker VM is designated as the Primary node and the remaining nodes are designated as standby nodes. The cluster architecture depends on the types of applets configured in the cluster. Applets on cluster nodes run in either active/active or active/passive mode, and behave differently in each mode, as detailed in the table below.

Note

With Cortex XDR Prevent, an HA cluster is relevant only for the Local Agent Settings applet, because it is the only applet this product license supports. The other applets are collector applets, which are available only in Cortex XDR Pro or Cortex XSIAM.

Applet Mode	Applet Behavior	Applets
active/active	Applets in active/active mode run simultaneously on all the nodes in the cluster to achieve High Availability and Load Balancing. If an applet fails on a particular node, all traffic is redistributed to the remaining nodes in the HA cluster. Note: For load balancing, you must install a load balancer in your network to distribute the incoming data between the nodes.	Syslog Collector, Netflow Collector, Windows Event Collector, Local Agent Settings
active/passive	Applets in active/passive mode run only on the Primary node designated in the cluster. The other nodes are synchronized and ready to transition from standby to active Primary node if a failover occurs. In this mode, all nodes share the same configuration settings, but only one node operates at a given time.	Kafka Collector, Network Mapper, CSV Collector, FTP Collector, Files and Folders Collector, DB Collector

Note

The Pathfinder applet isn't supported when configuring Broker VMs in HA clusters.
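
The table and notes above can be summarized as a small lookup. The following is an illustrative sketch only, not part of Cortex XDR or any Cortex API: the names `AppletMode`, `APPLET_MODES`, and `nodes_running` are hypothetical, and the node names in the usage example are made up.

```python
from enum import Enum


class AppletMode(Enum):
    ACTIVE_ACTIVE = "active/active"
    ACTIVE_PASSIVE = "active/passive"


# Applet-to-mode mapping taken from the table above (hypothetical structure).
APPLET_MODES = {
    "Syslog Collector": AppletMode.ACTIVE_ACTIVE,
    "Netflow Collector": AppletMode.ACTIVE_ACTIVE,
    "Windows Event Collector": AppletMode.ACTIVE_ACTIVE,
    "Local Agent Settings": AppletMode.ACTIVE_ACTIVE,
    "Kafka Collector": AppletMode.ACTIVE_PASSIVE,
    "Network Mapper": AppletMode.ACTIVE_PASSIVE,
    "CSV Collector": AppletMode.ACTIVE_PASSIVE,
    "FTP Collector": AppletMode.ACTIVE_PASSIVE,
    "Files and Folders Collector": AppletMode.ACTIVE_PASSIVE,
    "DB Collector": AppletMode.ACTIVE_PASSIVE,
}


def nodes_running(applet, primary, standbys):
    """Return the nodes on which the given applet is expected to run."""
    if applet not in APPLET_MODES:
        # For example, Pathfinder is not supported in HA clusters.
        raise ValueError(f"{applet!r} is not supported in an HA cluster")
    if APPLET_MODES[applet] is AppletMode.ACTIVE_ACTIVE:
        # Active/active applets run on every node in the cluster.
        return [primary, *standbys]
    # Active/passive applets run only on the Primary node.
    return [primary]


if __name__ == "__main__":
    print(nodes_running("Syslog Collector", "broker-1", ["broker-2", "broker-3"]))
    print(nodes_running("Kafka Collector", "broker-1", ["broker-2", "broker-3"]))
```

Modeling unsupported applets as an error rather than a third mode keeps the lookup aligned with the two modes the table describes.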

Automatic Failover

In each cluster, whenever the Primary node fails, Cortex XDR automatically switches to one of the standby nodes, starts the applets on the new Primary node, and continues data collection on that node. Every failover attempt, whether successful or not, displays an alert in the notification area and is logged in the Management Audit Logs table.

The following conditions can trigger a failover for the Primary node:

Connectivity issues between the Primary node and the Cortex XDR server.
Application failure, such as an applet failing to start or an applet crashing.
Failure of an internal component, such as MariaDB, Redis, RabbitMQ, or the Docker engine.
Hardware failure, including:
- Running out of disk space
- CPU usage of more than 95% for more than 10 minutes
- Memory usage of more than 95% for more than 10 minutes
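
To make the conditions above concrete, here is a minimal sketch of how they could be evaluated together. It is not how Cortex XDR implements failover detection. `NodeHealth`, `failover_reasons`, and all field names are hypothetical, and the sketch assumes one CPU and memory sample per minute, so "more than 10 minutes" is approximated as the 11 most recent samples all exceeding 95%.

```python
from dataclasses import dataclass
from typing import List

THRESHOLD_PERCENT = 95.0  # from the list above
SUSTAINED_MINUTES = 10    # from the list above


@dataclass
class NodeHealth:
    """Hypothetical health snapshot of a Primary node."""
    connected_to_server: bool
    applets_ok: bool
    internal_components_ok: bool
    disk_free_bytes: int
    cpu_percent_per_minute: List[float]      # assumed: one sample per minute, newest last
    memory_percent_per_minute: List[float]   # assumed: one sample per minute, newest last


def _sustained_above(samples, threshold, minutes):
    """True if the newest minutes + 1 samples all exceed the threshold."""
    needed = minutes + 1
    recent = samples[-needed:]
    return len(recent) == needed and all(s > threshold for s in recent)


def failover_reasons(health):
    """Return the list of conditions that would trigger a failover (empty if none)."""
    reasons = []
    if not health.connected_to_server:
        reasons.append("connectivity")
    if not health.applets_ok:
        reasons.append("application")
    if not health.internal_components_ok:
        reasons.append("internal component")
    if health.disk_free_bytes <= 0:
        reasons.append("disk space")
    if _sustained_above(health.cpu_percent_per_minute, THRESHOLD_PERCENT, SUSTAINED_MINUTES):
        reasons.append("cpu")
    if _sustained_above(health.memory_percent_per_minute, THRESHOLD_PERCENT, SUSTAINED_MINUTES):
        reasons.append("memory")
    return reasons
```

Returning every matching reason, rather than stopping at the first, mirrors the fact that any one of the listed conditions is sufficient to trigger a failover.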

Manual Switchover

At any time, you can initiate a manual switchover to transfer the Primary role from the current Primary node to another node in the HA cluster, for example, to perform maintenance.

Automatic Upgrades

You can configure automatic upgrades for the nodes in a Broker VM HA cluster. Automatic upgrades use a rolling upgrade mechanism, which updates cluster nodes without noticeable downtime or other disruption of the HA cluster service. An automatic upgrade is performed in the following order:

Standby nodes are upgraded one by one.
The Primary node is switched over to one of the upgraded standby nodes.
The previous Primary node, now a standby node, is upgraded.
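
The upgrade order above can be sketched as a simple plan builder. This is an illustration only, not a Cortex XDR API: the function `rolling_upgrade_plan` and the step labels are hypothetical. The source says only that the Primary role moves to "one of the upgraded standby nodes", so choosing the first standby here is an assumption.

```python
def rolling_upgrade_plan(primary, standbys):
    """Return the ordered (action, node) steps of a rolling upgrade."""
    if not standbys:
        raise ValueError("a rolling upgrade needs at least one standby node")

    # 1. Standby nodes are upgraded one by one.
    steps = [("upgrade", node) for node in standbys]

    # 2. The Primary role is switched over to an upgraded standby node.
    #    Assumption: pick the first standby; the real selection is not documented here.
    new_primary = standbys[0]
    steps.append(("switchover", new_primary))

    # 3. The previous Primary node, now a standby node, is upgraded last.
    steps.append(("upgrade", primary))
    return steps


if __name__ == "__main__":
    for action, node in rolling_upgrade_plan("broker-1", ["broker-2", "broker-3"]):
        print(action, node)
```

Because the Primary node is upgraded only after the switchover, a node that runs the active/passive applets is never taken down for upgrade while it holds the Primary role.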