Troubleshoot Elasticsearch

Cortex XSOAR Administrator Guide

Product: Cortex XSOAR
Version: 6.10
Creation date: 2022-10-13
Last date published: 2024-09-05
End_of_Life: EoL
Category: Administrator Guide
Abstract

Troubleshoot common issues in Cortex XSOAR Elasticsearch deployments.

Your Elasticsearch deployment can have issues with feed ingestion, memory, or general functionality.

After reviewing the troubleshooting items in the table below, if you need to create a support ticket:

  1. Set the log level to debug by going to Settings > About > Troubleshooting.

  2. Reproduce the issue and download server log bundles.

    (High Availability) For high availability deployments, you only need to download the server log bundle once, as it gathers logs from all online application servers. If your high availability servers are behind a load balancer and the log bundle times out, increase the timeout on your load balancer to five minutes.

  3. Attach the logs to the support ticket.

General Issues

Issue

Description

Recommendation

limit of total fields (X) in index has been exceeded

Mapping in Elasticsearch exceeded maximum configured field capacity.

Use totalFields under the Elasticsearch configuration in your demisto.conf to set a custom total fields limit for the matching index. By default, common-incident, common-indicator, and common-evidence are set to 2000. You can also set the total fields limit on a specific index using:

PUT <indexFullName>/_settings { "index.mapping.total_fields.limit": 2000 }
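As a minimal sketch, the settings request above could be built from a script as follows. The host, the index name, and the helper name `build_total_fields_request` are assumptions for illustration only; the request is constructed but not sent.

```python
import json
import urllib.request


def build_total_fields_request(base_url: str, index_name: str, limit: int) -> urllib.request.Request:
    """Build (but do not send) a PUT <index>/_settings request that sets the total fields limit."""
    body = json.dumps({"index.mapping.total_fields.limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index_name}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical host and index name; substitute your own cluster address and full index name.
    req = build_total_fields_request("http://localhost:9200", "example-index", 2000)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply the setting, send the request with urllib.request.urlopen(req),
    # adding whatever authentication your cluster requires.
```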

field expansion matches too many fields, limit: X, got: X+Y

The number of fields a query can target exceeded the limit of X.

Set indices.query.bool.max_clause_count in the elasticsearch.yml file to a higher value and restart Elasticsearch.

request to elasticsearch exceeded maximum size, increase 'http.max_content_length' in your elasticsearch.yml to allow larger requests

or

(413) request entity too large

A request to Elasticsearch exceeded the maximum size of http.max_content_length.

If the request is a bulk save, decrease the innerBatchSize bulk size, which is set to 250 by default. Alternatively, increase the http.max_content_length setting in elasticsearch.yml to raise the maximum request size that Elasticsearch accepts.
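To illustrate why a smaller batch size keeps bulk requests under the size limit, here is a rough sketch that groups documents into batches bounded by both a document count and a byte budget. It is not how Cortex XSOAR batches requests internally; `split_into_batches` and its parameters are hypothetical, and the byte estimate ignores the action lines that a real bulk request also carries.

```python
import json
from typing import Iterable, Iterator, List


def split_into_batches(docs: Iterable[dict], max_docs: int, max_bytes: int) -> Iterator[List[dict]]:
    """Yield batches holding at most max_docs documents and roughly max_bytes of JSON."""
    batch: List[dict] = []
    batch_bytes = 0
    for doc in docs:
        # Approximate the serialized size of the document plus its trailing newline.
        doc_bytes = len(json.dumps(doc).encode("utf-8")) + 1
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        # A single document larger than max_bytes is still emitted, alone in its batch.
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch


if __name__ == "__main__":
    sample = [{"id": i, "details": "x" * 100} for i in range(600)]
    # 250 mirrors the default innerBatchSize; the byte budget here is an arbitrary example value.
    for n, b in enumerate(split_into_batches(sample, max_docs=250, max_bytes=1_000_000)):
        print(f"batch {n}: {len(b)} documents")
```

Lowering `max_docs` in this sketch plays the same role as lowering innerBatchSize: each request carries fewer documents and is therefore smaller.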

too many requests to elasticsearch

  1. Verify the Elasticsearch cluster state is healthy (green) and all nodes are optimized and balanced. If the too many requests error is from a specific node, it might be related to node optimization.

  2. Verify the limit on the number of open file descriptors is sufficient to handle the number of parallel requests.

  3. If the error occurs during batch update operations, decrease the base batch size used by Cortex XSOAR when sending requests to Elasticsearch. Edit innerBatchSize under elasticsearch in your demisto.conf file and restart Cortex XSOAR. See the full list of Elasticsearch configurations.

  4. If the error occurs when using the migration to Elasticsearch tool, rerun the migration with the -elastic-batch-size flag to lower the batch request size. See the full list of migration flags.
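The first check in the list above can be scripted. The following sketch reads the cluster health API (`GET _cluster/health`) and reports anything that is not healthy. The host is an assumption, and the helper names `fetch_cluster_health` and `health_problems` are hypothetical; the response fields used (`status`, `unassigned_shards`, `number_of_pending_tasks`) are part of the standard cluster health response.

```python
import json
import urllib.request
from typing import List


def fetch_cluster_health(base_url: str) -> dict:
    """Call GET <base_url>/_cluster/health and return the parsed JSON response."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/_cluster/health") as resp:
        return json.load(resp)


def health_problems(health: dict) -> List[str]:
    """Return human-readable findings; an empty list means no problem was detected."""
    problems = []
    status = health.get("status")
    if status != "green":
        problems.append(f"cluster status is {status}, expected green")
    if health.get("unassigned_shards", 0) > 0:
        problems.append(f"{health['unassigned_shards']} unassigned shards")
    if health.get("number_of_pending_tasks", 0) > 0:
        problems.append(f"{health['number_of_pending_tasks']} pending cluster tasks")
    return problems


if __name__ == "__main__":
    # Example response fragment; replace with fetch_cluster_health("http://localhost:9200")
    # (hypothetical address) to query a live cluster.
    example = {"status": "yellow", "unassigned_shards": 3, "number_of_pending_tasks": 0}
    for finding in health_problems(example):
        print(finding)
```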

unable to search on entries

By default, indexing entries content is disabled for performance reasons.

Set the db.index.entry.disable server configuration to false to enable indexing.

To index only notes, changes, and evidence for search, set the granular.index.entries: 1 server configuration together with db.index.entry.disable: false.

Note

After making these changes, the new mapping is applied only when a new month begins. To apply the change to existing entries, reindex the common-entry_* indices.

too many open files

By default, most Linux distributions allow 1,024 file descriptors per process. This is too low for an Elasticsearch node, especially one that needs to handle hundreds of indices.

Increase your file descriptor count to 64,000.
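As a quick check, the sketch below compares the open file limit of the current process with the 64,000 recommendation. It assumes a Unix-like host and must be run as the same user that runs Elasticsearch to be meaningful; it inspects the Python process, not the Elasticsearch process itself. The helper name `nofile_is_sufficient` is hypothetical.

```python
import resource

RECOMMENDED_NOFILE = 64000  # value recommended in this guide


def nofile_is_sufficient(soft_limit: int, recommended: int = RECOMMENDED_NOFILE) -> bool:
    """Return True if the soft open-file limit meets the recommended minimum."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # unlimited
    return soft_limit >= recommended


if __name__ == "__main__":
    # getrlimit returns the (soft, hard) limits for the calling process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft limit: {soft}, hard limit: {hard}")
    if not nofile_is_sufficient(soft):
        print(f"Open file limit is below {RECOMMENDED_NOFILE}; raise it for the Elasticsearch user.")
```

The limit that Elasticsearch actually runs with is reported as `max_file_descriptors` in the process section of the nodes stats API.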

[400] Failed with error [1:417] [bool] failed to parse field [must]. Other reasons: [[{x_content_parse_exception [1:417] [bool] failed to parse field [must]}]]

Queries with multiple or conditions, such as "id:1 or id:2 or id:3 or id:4 or id:5 or id:6 or id:7 or id:8 or id:9 or id:10" may fail when using Elasticsearch v7.14 or later.

Avoid using or by providing a value list instead, for example "id:(1 2 3...)".
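If such queries are generated by a script or automation, the rewrite can be done programmatically. This is a minimal sketch; the helper name `to_value_list` is hypothetical, and it assumes the values contain no spaces or special characters that would need quoting.

```python
from typing import Iterable


def to_value_list(field: str, values: Iterable) -> str:
    """Build a 'field:(v1 v2 v3)' query instead of chaining conditions with 'or'."""
    parts = [str(v) for v in values]
    if not parts:
        raise ValueError("at least one value is required")
    return f"{field}:({' '.join(parts)})"


if __name__ == "__main__":
    print(to_value_list("id", range(1, 11)))
```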

cannot restore index [.geoip_databases] because an open index with same name already exists in the cluster. Either close or delete the existing index or restore the index under a different name by providing a rename pattern and replacement name

When using Elasticsearch v7.14 or later, you may encounter failures when restoring from snapshots.

Add ingest.geoip.downloader.enabled: false to your Elasticsearch configuration file.

ReleasableBytesStreamOutput cannot hold more than 2GB of data. Other reasons: [[{illegal_argument_exception ReleasableBytesStreamOutput cannot hold more than 2GB of data}]] ) [error '[400] Failed with error: ReleasableBytesStreamOutput cannot hold more than 2GB of data

If there are insufficient shards, Elasticsearch’s circuit breaker limit may be reached due to the search load.

Increase the number of shards. For example, if you have a cluster with three data nodes, you should have at least two replicas for each active shard, so that the data is available on all nodes. We also recommend using nodes stats to verify that the nodes are balanced for read and write operations.

Data too large, data for [<xxx>] would be [xxxx/xxxx], which is larger than the limit of [xxxxx/xxxxx]

Elasticsearch's circuit breaker limit is reached due to the load of indexing operations.

  1. Verify you have sufficient JVM memory. The Elasticsearch default setting is significantly lower than required.

  2. Verify you have implemented Cortex XSOAR best practices for Elasticsearch and set up a sufficient number of primary shards for the heavily used indices.

  3. Verify the cluster is balanced correctly by using nodes stats. Identify unbalanced nodes that store most or all of the primary shards, which causes writes to queue on the node and decreases performance for all shards on that node. In this case, we recommend moving some primary shards to a less busy node and/or adding primary shards so that the nodes balance their resource usage (CPU, memory, and I/O).
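One way to spot the imbalance described in the last step is to count primary shards per node. The sketch below does this from the JSON output of the cat shards API (`GET _cat/shards?format=json`), which is a different API from the nodes stats mentioned above and is used here only because its output is easy to tally. The helper names and the 1.5 tolerance factor are assumptions for illustration.

```python
from typing import Dict, List


def primaries_per_node(shard_rows: List[dict]) -> Dict[str, int]:
    """Count primary shards per node from rows of GET _cat/shards?format=json."""
    counts: Dict[str, int] = {}
    for row in shard_rows:
        node = row.get("node")
        if not node:
            continue  # unassigned shards have no node
        counts.setdefault(node, 0)  # nodes holding only replicas still appear, with 0
        if row.get("prirep") == "p":
            counts[node] += 1
    return counts


def overloaded_nodes(counts: Dict[str, int], tolerance: float = 1.5) -> List[str]:
    """Return nodes holding more than tolerance times the mean number of primaries."""
    if not counts:
        return []
    mean = sum(counts.values()) / len(counts)
    return sorted(node for node, count in counts.items() if count > mean * tolerance)


if __name__ == "__main__":
    # Example rows with hypothetical index and node names.
    rows = [
        {"index": "example", "shard": "0", "prirep": "p", "node": "node-a"},
        {"index": "example", "shard": "1", "prirep": "p", "node": "node-a"},
        {"index": "example", "shard": "0", "prirep": "r", "node": "node-b"},
        {"index": "example", "shard": "1", "prirep": "r", "node": "node-c"},
    ]
    counts = primaries_per_node(rows)
    print(counts)
    print("overloaded:", overloaded_nodes(counts))
```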

HTTP 504 gateway request timeouts

HTTP requests are timing out and preventing playbook data from loading in the browser.

If your high availability servers are behind a load balancer, increase the timeout on your load balancer to 300s.

Memory Issues

Issue

Description

Recommendation

Insufficient JVM memory

The default JVM memory is 1 GB. In production environments, this might be insufficient.

Increase the JVM memory.

Note

You should set the JVM to no more than 50% of total machine memory and not more than 32GB.
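The rule in the note above is simple arithmetic: take half of the machine memory and cap the result at 32GB. A minimal sketch follows; the helper names are hypothetical, and `-Xms` and `-Xmx` are the standard JVM flags for the initial and maximum heap size.

```python
from typing import List


def recommended_heap_gb(total_ram_gb: float, cap_gb: float = 32.0) -> float:
    """Return the heap size in GB: 50% of total memory, capped at cap_gb."""
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(total_ram_gb * 0.5, cap_gb)


def jvm_heap_flags(heap_gb: float) -> List[str]:
    """Format the heap size as matching -Xms/-Xmx flags, expressed in megabytes."""
    heap_mb = int(heap_gb * 1024)
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


if __name__ == "__main__":
    for ram in (16, 64, 128):
        heap = recommended_heap_gb(ram)
        print(f"{ram} GB RAM -> {heap} GB heap -> {' '.join(jvm_heap_flags(heap))}")
```

For example, a 16GB machine gets an 8GB heap, while machines with 64GB or more are capped at 32GB.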

Insufficient term query size

The term query size is used by the bulk edit. The default term query size is 65536 and may be insufficient.

Increase the term query size.

Insufficient bulk size

The bulk size depends on the available JVM memory and affects the amount of data that Cortex XSOAR can send and process in Elasticsearch.

Heap size

The recommended maximum heap size is 50% of the server's total memory, as long as the other 50% is free.

Performance issues due to swapping enabled

Disable swapping in Elasticsearch to improve performance.

Feed Ingestion Issues

Issue

Description

Recommendation

Stack overflow

In some cases, complex search queries cause Elasticsearch to fail on stack overflow.

Use the following search query syntax: field:(a,b,c …), which is based on a clause count. In this example, a, b, and c each represent a clause.

To determine how many clauses a query can contain, set the maximum clause count and the maximum total field count in the elasticsearch.yml file.

Maximum clause count

Key:

For ES 5.0 and later the key is indices.query.bool.max_clause_count.

For ES versions earlier than 5.0 the key is index.query.bool.max_clause_count.

Default: 1,024. You can increase the value.

Maximum total field count

Key: index.mapping.total_fields.limit

Default: 1,000. You can increase the value.
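When a value list may exceed the maximum clause count, one option is to split it into several queries instead of raising the limit. The sketch below assumes, as described above, that each value in field:(a,b,c …) counts as one clause. The helper name `chunked_queries` is hypothetical, and combining the results of the separate queries is left to the caller.

```python
from typing import Iterable, Iterator


def chunked_queries(field: str, values: Iterable, max_clauses: int = 1024) -> Iterator[str]:
    """Yield 'field:(a,b,c)' queries that each contain at most max_clauses values."""
    if max_clauses < 1:
        raise ValueError("max_clauses must be at least 1")
    items = [str(v) for v in values]
    for start in range(0, len(items), max_clauses):
        chunk = items[start:start + max_clauses]
        yield f"{field}:({','.join(chunk)})"


if __name__ == "__main__":
    # 2,500 hypothetical values split against the default limit of 1,024 clauses.
    queries = list(chunked_queries("value", range(2500)))
    print(len(queries), "queries")
```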