Learn more about creating Broker VMs in a High Availability Cluster
High availability (HA) is a deployment in which at least two Broker VMs are placed in a Broker VM cluster and their configuration is synchronized to prevent a single point of failure on your network at the hardware and application level. A heartbeat connection between the Broker VM nodes and the Cortex XDR Server ensures seamless failover if a node fails. Setting up an HA cluster provides redundancy and enables data collection continuity.
The Clusters tab on the Broker VMs page enables you to view your cluster configurations, including the associated nodes, node statuses, configured applets, and applet statuses. You can add as many clusters as you want in a tenant, and each Cortex XDR cluster can include as many nodes as you need. Cluster operation is fully managed from the tenant, so there is no need to install additional components, and the cluster nodes do not need to communicate with one another on the network.
In each cluster, one Broker VM is designated as the Primary cluster node and the rest of the nodes are designated as standby nodes. The cluster architecture depends on the type of applets configured in the cluster. Applets on cluster nodes run in either active/active mode or active/passive mode and behave differently in each mode, as detailed in the table below.
Note
With Cortex XDR Prevent, it's only relevant to configure an HA cluster with a Local Agent Settings applet, as this is the only applet supported for this product license. The other applets are collector applets, which are only available in Cortex XDR Pro or Cortex XSIAM.
Applet Mode | Applet Behavior | Applets |
---|---|---|
active/active | The applets that operate in active/active mode run simultaneously on all the nodes in the cluster to achieve high availability and load balancing. Failure of an applet on a particular node causes all traffic to be redistributed to the remaining nodes in the HA cluster. Note: For load balancing, you must install a load balancer in your network to distribute the incoming data between the nodes. | The active/active applets are: |
active/passive | The applets that operate in active/passive mode run only on the Primary node designated in the cluster. The other nodes are synchronized and ready to transition from standby to the active Primary node if a failover occurs. In this mode, all nodes share the same configuration settings, while only one node operates at a given time. | The active/passive applets are: |
Note
The Pathfinder applet isn't supported when configuring Broker VMs in HA clusters.
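The difference between the two applet modes can be illustrated with a minimal Python sketch. This is only a model of the behavior described in the table, not Cortex XDR code: the `Node` class, the `nodes_running_applet` function, and the node names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    # Hypothetical model of a cluster node; not a Cortex XDR type.
    name: str
    is_primary: bool = False
    healthy: bool = True


def nodes_running_applet(nodes: List[Node], mode: str) -> List[str]:
    """Return the names of the nodes on which an applet of the given mode runs."""
    if mode == "active/active":
        # The applet runs simultaneously on every healthy node.
        return [n.name for n in nodes if n.healthy]
    if mode == "active/passive":
        # The applet runs only on the Primary node; standby nodes stay idle.
        return [n.name for n in nodes if n.is_primary and n.healthy]
    raise ValueError(f"unknown applet mode: {mode}")


cluster = [Node("broker-1", is_primary=True), Node("broker-2"), Node("broker-3")]
print(nodes_running_applet(cluster, "active/active"))
print(nodes_running_applet(cluster, "active/passive"))
```

In this model, marking a node as unhealthy removes it from the active/active result, which corresponds to traffic being redistributed to the remaining nodes.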
In each cluster, whenever there's a failure on the Primary node, Cortex XDR automatically switches to one of the standby nodes, initiates the applets on the new Primary node, and continues data collection on that node. Any successful or unsuccessful failover attempt displays an alert in the notification area and is logged in the Management Audit Logs table.
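As a rough illustration of the failover flow described above, the following Python sketch promotes a standby node and records the outcome. How Cortex XDR actually selects the standby node is not stated here, so picking the first healthy standby is an assumption, and `pick_new_primary`, `audit_log`, and the node names are hypothetical.

```python
from typing import Dict, List, Optional


def pick_new_primary(standbys: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the standby node to promote, or None if no standby is healthy.

    Assumption: the first healthy standby is chosen; the real selection
    logic is not documented in this section.
    """
    for name in standbys:
        if healthy.get(name, False):
            return name
    return None


# Hypothetical stand-in for the Management Audit Logs table.
audit_log: List[str] = []

new_primary = pick_new_primary(
    ["broker-2", "broker-3"],
    {"broker-2": False, "broker-3": True},
)
# Both successful and unsuccessful attempts are logged.
outcome = f"promoted {new_primary}" if new_primary else "unsuccessful"
audit_log.append(f"failover from broker-1: {outcome}")
print(audit_log[-1])
```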
The following conditions can trigger a failover for the Primary node:
Connectivity issues between a Primary node and the Cortex XDR server.
Application failure, such as an applet failing to start or crashing.
Any failure of one of the internal components, such as MariaDB, Redis, RabbitMQ, or the Docker engine.
Hardware failure, including:
Running out of disk space
CPU usage of more than 95% for more than 10 minutes
Memory usage of more than 95% for more than 10 minutes
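The conditions listed above can be summarized as a simple health check. The sketch below is illustrative only: the `NodeHealth` fields and the `failover_reasons` function are hypothetical names that mirror the listed conditions, and only the 95% and 10-minute thresholds come from the text.

```python
from dataclasses import dataclass
from typing import List

USAGE_THRESHOLD_PERCENT = 95  # from the text: usage of more than 95%
SUSTAINED_MINUTES = 10        # from the text: for more than 10 minutes


@dataclass
class NodeHealth:
    # All field names are hypothetical; they mirror the listed conditions.
    server_reachable: bool = True
    applets_running: bool = True
    internal_components_ok: bool = True  # MariaDB, Redis, RabbitMQ, Docker engine
    disk_free_bytes: int = 1
    cpu_minutes_over_threshold: float = 0.0     # minutes CPU has stayed above 95%
    memory_minutes_over_threshold: float = 0.0  # minutes memory has stayed above 95%


def failover_reasons(health: NodeHealth) -> List[str]:
    """Return every condition that would trigger a failover for this node."""
    sustained = f"above {USAGE_THRESHOLD_PERCENT}% for more than {SUSTAINED_MINUTES} minutes"
    reasons = []
    if not health.server_reachable:
        reasons.append("no connectivity to the Cortex XDR server")
    if not health.applets_running:
        reasons.append("application failure")
    if not health.internal_components_ok:
        reasons.append("internal component failure")
    if health.disk_free_bytes <= 0:
        reasons.append("out of disk space")
    if health.cpu_minutes_over_threshold > SUSTAINED_MINUTES:
        reasons.append(f"CPU usage {sustained}")
    if health.memory_minutes_over_threshold > SUSTAINED_MINUTES:
        reasons.append(f"memory usage {sustained}")
    return reasons


sample = NodeHealth(cpu_minutes_over_threshold=12.5)
print(failover_reasons(sample))
```

An empty list means that none of the listed conditions applies to the node.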
At any time, you can change the role of the current Primary node in the cluster to another node in the HA cluster, for example, to perform maintenance, by initiating a manual switchover.
You can configure automatic upgrades for the nodes in a Broker VM HA cluster. The upgrade uses a rolling upgrade mechanism, which updates the cluster nodes without noticeable downtime or other disruption of the HA cluster service. An automatic upgrade is performed in the following order:
Standby nodes are upgraded one by one.
The Primary node is switched over to one of the upgraded standby nodes.
The previous Primary node, now a standby node, is upgraded.
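The upgrade order above can be sketched as a small planning function in Python. This is a model of the sequence only, not the product's implementation: `rolling_upgrade_plan` and the node names are hypothetical, and because the text says only that the Primary role moves to "one of the upgraded standby nodes", choosing the first standby is an assumption.

```python
from typing import List, Tuple


def rolling_upgrade_plan(primary: str, standbys: List[str]) -> List[Tuple[str, str]]:
    """Return the ordered (action, node) steps of a rolling upgrade."""
    if not standbys:
        raise ValueError("a rolling upgrade needs at least one standby node")
    # Step 1: standby nodes are upgraded one by one.
    plan = [("upgrade", node) for node in standbys]
    # Step 2: the Primary role is switched over to an upgraded standby node.
    # Assumption: the first standby is chosen.
    plan.append(("switchover", standbys[0]))
    # Step 3: the previous Primary node, now a standby node, is upgraded.
    plan.append(("upgrade", primary))
    return plan


for action, node in rolling_upgrade_plan("broker-1", ["broker-2", "broker-3"]):
    print(action, node)
```

Because the Primary role moves before the last upgrade, a node running the new version is always available to run the active/passive applets.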