High Availability for Cortex XSOAR

Cortex XSOAR On-prem Documentation

Product: Cortex XSOAR
Version: 8.13
Creation date: 2026-02-12
Last date published: 2026-05-27
Category: Administrator Guide
Solution: On-prem
Abstract: Ensure reliable and continuous operation with High Availability.

High availability keeps your systems running even if one of your components fails. It provides redundancy for the different components, so if a problem occurs, it has a minimal effect on your system.

If you deploy a three-node cluster and set the Cortex XSOAR access IP address to either a virtual IP or the IP of the reverse proxy/ingress controller, the system implements built-in high availability. This enables workload distribution and data replication across the nodes, and continuous operation if one node fails.

Note

Kubernetes requires a majority of control plane nodes to be online to function, so a three-node cluster requires at least two nodes to be online. If two nodes fail but are repaired and come back online, the cluster recovers. However, if two nodes fail and cannot be brought back online, open a support session for assistance.
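The majority rule in the note above can be checked with simple arithmetic. The following is a minimal sketch, not part of Cortex XSOAR; the function names `quorum` and `tolerated_failures` are hypothetical and exist only for illustration.

```python
def quorum(control_plane_nodes: int) -> int:
    """Return the smallest node count that forms a strict majority."""
    if control_plane_nodes < 1:
        raise ValueError("a cluster needs at least one control plane node")
    return control_plane_nodes // 2 + 1


def tolerated_failures(control_plane_nodes: int) -> int:
    """Return how many nodes can be offline while a majority remains."""
    return control_plane_nodes - quorum(control_plane_nodes)


# A three-node cluster needs two nodes online and tolerates one failure.
print(quorum(3), tolerated_failures(3))
```

For a three-node cluster this gives a quorum of two and one tolerated failure, which matches the note: losing a single node is survivable, while losing two stops the control plane until at least one of them returns.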

Built-in High Availability

Built-in High Availability works as follows:

  • Tasks and data are distributed across the nodes to balance the load.

  • Data is replicated across nodes, ensuring no single point of failure.

  • If a node goes down, the workloads on the failed node are automatically redistributed to the other nodes.

    Note

    There may be several minutes of downtime until the other nodes take over.

  • Once the failed node is restored, it automatically reintegrates into the cluster and the workloads are automatically rescheduled.
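The failover behavior described in the list above can be pictured with a toy model. This sketch is an assumption-laden illustration only: it does not reproduce the Kubernetes scheduler or any Cortex XSOAR logic, and the node names, workload names, and the `redistribute` function are all hypothetical.

```python
def redistribute(assignments: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Move each workload from the failed node to the least-loaded survivor.

    Toy model only: ties are broken by node name so the result is deterministic.
    """
    survivors = {node: list(work) for node, work in assignments.items() if node != failed}
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the workloads")
    for workload in assignments.get(failed, []):
        target = min(survivors, key=lambda node: (len(survivors[node]), node))
        survivors[target].append(workload)
    return survivors


# Hypothetical three-node cluster with two workloads per node.
cluster = {
    "node-1": ["a", "b"],
    "node-2": ["c", "d"],
    "node-3": ["e", "f"],
}
after_failure = redistribute(cluster, failed="node-3")
print(after_failure)
```

In this example the two workloads from the failed node are split between the two surviving nodes, so each survivor carries three workloads until the failed node rejoins.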

For more information on setting up built-in High Availability for your specific deployment by deploying a cluster of three nodes, see Cortex XSOAR Installation.

Backup and restore between primary and secondary data centers

Once you deploy your cluster, you can use backup and restore operations for disaster recovery.

Important

The restore environment must run the same Cortex XSOAR version with the same resources as the original environment to ensure seamless restoration (the clusters must be the same).

If you periodically back up the cluster to external storage, you can restore it from that storage if the original cluster becomes unavailable. For more information, see Back up data.
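The requirement that the restore environment match the original can be expressed as a pre-restore comparison. This is a minimal sketch under stated assumptions: the `ClusterSpec` fields and the `restore_mismatches` function are hypothetical, the values are made up, and Cortex XSOAR does not necessarily expose these properties in this form.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of a cluster; field names are assumptions."""
    version: str
    node_count: int
    cpu_cores_per_node: int
    memory_gb_per_node: int
    disk_gb_per_node: int


def restore_mismatches(source: ClusterSpec, target: ClusterSpec) -> list[str]:
    """Return the names of the fields that differ between the two clusters."""
    return [
        f.name
        for f in fields(source)
        if getattr(source, f.name) != getattr(target, f.name)
    ]


# Illustrative values only.
primary = ClusterSpec("8.13", 3, 16, 64, 500)
secondary = ClusterSpec("8.12", 3, 16, 64, 500)
print(restore_mismatches(primary, secondary))
```

An empty list means the two descriptions match. Any field name in the result points to a difference to resolve before attempting the restore.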

Monitor and manage nodes

Once you set up and install your cluster, you can monitor node status and recover from node failure as needed.

  1. In Cortex XSOAR, monitor the node health on the System Diagnostics page. For more information, see View system status in the System Diagnostics page.

  2. If there is a node failure, manage the nodes from the textual UI.

    For example, if a node fails, remove it and then add a new node to replace it. For more information, see Manage nodes in a cluster.

    If you replace a node in the cluster after completing the installation, you need to set the host again and reestablish trust between all the nodes.