High Availability Overview - Administrator Guide - 6.6 - Cortex XSOAR - Cortex

Cortex XSOAR Administrator Guide

Product: Cortex XSOAR
Version: 6.6
Creation date: 2022-09-29
Last date published: 2024-04-08
End_of_Life: EoL

Flow

To implement a high availability deployment, prepare your environment by completing the following steps:

Migrate your existing database to Elasticsearch or, for new customers, install a new instance of Elasticsearch.
Configure a load balancer.
Configure a shared file server. For single-instance deployments, the file server must be shared between the app servers. For multi-tenant deployments, a shared file server is required for the main hosts, as well as for each HA group.
Proceed with the implementation of the high availability configuration, following the procedure for your deployment type:
Migrate a Single Instance for High Availability
Migrate a Multi-Tenant Deployment for High Availability

Architecture

Depending on whether your system is a single-instance or a multi-tenant deployment, you can achieve high availability using one of the following architectures.

Single-Instance Deployment

In a single-instance deployment, the application server is installed on a dedicated machine and connects to an Elasticsearch database server. The Elasticsearch database server automatically provides redundancy in accordance with how you have configured Elasticsearch.

You can also install multiple application servers behind a load balancer. The load balancer manages the requests, most commonly (though not necessarily) using a round-robin method.

The app servers also use a shared file system so that all of the necessary files are available to every application server in the cluster.

Multi-Tenant Deployment

As with a single-instance deployment, in a multi-tenant configuration you must first migrate your data to an Elasticsearch database, which separates the main account server from the database server. Elasticsearch, depending on how it is configured, provides the database redundancy.

To achieve full high availability, you can then install multiple main account servers behind a load balancer. The main account servers use an NFS server to share the required files, and they communicate with the designated index in Elasticsearch.

Note

Once the main host servers are highly available, you can no longer host new accounts on those servers. Existing accounts on the main host remain, but they are not highly available. Therefore, Cortex XSOAR recommends that you move the accounts from the main host to an HA group.

In addition, Cortex XSOAR v6.2 and above enables you to create high availability groups (HA groups), which form a cluster of hosts that provide redundancy for all of the accounts on those hosts. The HA groups also use an NFS server to share the required files, and communicate with the designated index in Elasticsearch.

HA groups provide your system with:

Redundancy for every account in the group
Redundancy for every host machine in the group
Performance improvements for every account in the group

Note

All hosts in an HA group must have the same hardware specifications.

In the full high availability architecture, you can still have hosts that are not part of an HA group and work with Elasticsearch. You can also maintain hosts that use the out-of-the-box configuration, in which the application server and the database are on the same machine and the database is Bolt/Bleve.

High Availability and Remote Repositories

The remote repositories feature in the UI is not supported on development environments that run as high availability (multiple app servers). You can still use a development > staging > production setup in which development is a single server (not high availability) and production is high availability. In this setup, both staging and production pull from the same Git repository. If your development environment runs as high availability, use the CI/CD Solution.

Load Balancing

You can install multiple application servers behind a load balancer. The load balancer must be configured to use sticky sessions and request timeouts. The load balancer should also use a health check to verify that each app server is up and running. Cortex XSOAR recommends that you use the /health route and verify that it returns an HTTP 200 response.
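As an illustration of the recommended health check, the following minimal Python sketch requests the /health route of each app server and treats only an HTTP 200 response as healthy. It is not part of Cortex XSOAR. The function names, the 5-second timeout, and the example host names are assumptions made for this example, and in practice the check is normally configured in the load balancer itself rather than scripted.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, TLS error, timeout, ...


def is_healthy(base_url: str, fetch=fetch_status) -> bool:
    """An app server counts as healthy only if GET /health returns 200."""
    return fetch(base_url.rstrip("/") + "/health") == 200


def healthy_servers(base_urls, fetch=fetch_status):
    """Keep only the app servers that pass the health check."""
    return [url for url in base_urls if is_healthy(url, fetch)]


if __name__ == "__main__":
    # Hypothetical app server addresses; replace with your own.
    servers = ["https://app1.example.test", "https://app2.example.test"]
    print(healthy_servers(servers))
```

The fetch parameter exists only so that the check can be exercised without a live server. A real load balancer would run an equivalent probe at a fixed interval and stop routing to any app server that fails it.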

Load balancing within HA groups for Multi-Tenant Deployment

Load balancing within HA groups uses an internal mechanism that distributes requests between all available hosts in the group. By default, the method used is round robin, but you can change it to random using the ha.host.selection.alg server configuration.

For a request to be directed to an account on a host, both the host and the requested account on that host must be alive and reachable:

On the host level, as long as at least one host in an HA group is alive and reachable, the host and all of the accounts within that group are fully available.
On the account level, as long as an account is alive and reachable on any of the hosts within that group, requests are directed to it, offering full availability for that account.
Load balancing on the Main account level is handled by the external load balancer.
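The selection rules described above can be sketched in Python as follows. This is a conceptual model only, not the internal Cortex XSOAR implementation: the Host and HAGroup classes, the "round_robin" and "random" labels, and the host and account names are all hypothetical, and the labels are not the documented values of ha.host.selection.alg.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    alive: bool = True
    # Accounts that are currently alive and reachable on this host.
    live_accounts: set = field(default_factory=set)


class HAGroup:
    """Toy model of request distribution inside one HA group."""

    def __init__(self, hosts, algorithm="round_robin", rng=None):
        self.hosts = list(hosts)
        self.algorithm = algorithm  # "round_robin" or "random" (assumed labels)
        self._next = 0  # round-robin position
        self._rng = rng or random.Random()

    def eligible(self, account):
        # Both the host and the account on that host must be alive.
        return [h for h in self.hosts
                if h.alive and account in h.live_accounts]

    def pick(self, account):
        """Return the host that should serve the request, or None."""
        candidates = self.eligible(account)
        if not candidates:
            return None  # the account is unavailable in this group
        if self.algorithm == "random":
            return self._rng.choice(candidates)
        host = candidates[self._next % len(candidates)]
        self._next += 1
        return host


if __name__ == "__main__":
    group = HAGroup([
        Host("host-a", live_accounts={"acme"}),
        Host("host-b", alive=False, live_accounts={"acme"}),
        Host("host-c", live_accounts={"acme"}),
    ])
    print([group.pick("acme").name for _ in range(3)])
```

In this model an account stays available as long as one live host in the group still runs it, which mirrors the host-level and account-level rules above.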

Engines and Load Balancing

Before you install an engine, ensure that you set both the Base URL and the External Host Name to the IP address of the load balancer (Settings → About → Troubleshooting).

The Base URL enables engines to connect to the load balancer. Setting the External Host Name to the load balancer enables the engines to stay connected through the load balancer even if one of the app servers fails. You can also add or remove app servers, and the engines remain connected.

Note

When you upgrade to v6.5 or later from a previous version, you need to either create a new engine or edit the configuration file of the existing engine.