Performance tuning can include general troubleshooting of memory and CPU usage, or troubleshooting of more specific UI issues, such as slow playbook execution. The information in this article helps you identify common causes of slow system performance and implement system improvements.
Memory Issues
Memory issues are a common cause of slow system performance. To verify whether memory issues are the cause, check for memory spikes in the system health dashboard and search the journalctl log for the following entry: kernel: Out of memory: Kill process X (server) score X or sacrifice child.
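The journalctl search described above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the regular expression is derived only from the log entry quoted above (the exact field layout may differ between kernel versions), and the sample lines, host name, and values are invented for the example.

```python
import re

# Pattern derived from the log entry quoted in this article. Treat the exact
# field layout as an assumption; kernels may word this message differently.
OOM_PATTERN = re.compile(
    r"kernel: Out of memory: Kill process (?P<pid>\d+) "
    r"\((?P<name>[^)]+)\) score (?P<score>\d+)"
)


def find_oom_kills(log_lines):
    """Return a (pid, process_name, score) tuple for each OOM-killer entry."""
    hits = []
    for line in log_lines:
        match = OOM_PATTERN.search(line)
        if match:
            hits.append((int(match["pid"]), match["name"], int(match["score"])))
    return hits


# Invented sample lines, used only to demonstrate the parser.
sample = [
    "Jan 10 03:12:45 host1 kernel: Out of memory: Kill process 2412 (server) score 912 or sacrifice child",
    "Jan 10 03:12:46 host1 systemd[1]: Started Session 42 of user root.",
]
oom_kills = find_oom_kills(sample)
print(oom_kills)
```

In practice, the lines could come from the kernel journal (for example, the output of journalctl -k --no-pager saved to a file) instead of the hard-coded sample.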
Issue | Relevant Database | Verification | Solution |
---|---|---|---|
Storing a large amount of historical data (more than 1 year) | BoltDB | Check the file sizes in the partition folder to verify that no file is larger than 10GB. The default directory is | Archive old data. For more information, learn how to Free up Disk Space with Data Archiving. |
Inefficient playbooks storing too much data | BoltDB | Validate with the Largest Incidents by Storage Size widget. This widget is part of the Common Widgets content pack. | Archive old data. For more information, learn how to Free up Disk Space with Data Archiving. |
Playbooks performing many loops or creating many entries | BoltDB | Verify with the dbstats route. | Reduce the number of loops and/or entries. |
Server machine does not meet the minimum memory requirements | BoltDB / Elasticsearch | Open the | Verify that your server meets the minimum memory requirements. |
Storage does not meet the minimum requirements | BoltDB | N/A | Verify that the disk is SSD based, with 3k dedicated IOPS. |
Too much data indexing of investigation tasks or entries | BoltDB | Check folders with prefix | Add the following server configuration and value to limit indexing: |
Docker containers | BoltDB / Elasticsearch | Check | |
Audits index folder | BoltDB | Check the file sizes in the audits index folder to verify that no file is larger than 3GB. The default directory is | Perform scheduled and manual audit purges. |
Messages index | BoltDB | Check the file size of the messages index to verify it isn’t larger than 3GB. The default directory is | Drop messages and delete the index. Add configuration setting: |
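Several rows in the table above ask you to confirm that no file in a folder exceeds a size limit (10GB for partition files, 3GB for the audits and messages indexes). A minimal sketch of such a check is shown below. The function name is hypothetical, the directory must be supplied by you because the default paths are not listed here, and treating 1 GB as 10^9 bytes is an assumption.

```python
from pathlib import Path

# Decimal gigabyte. This is an assumption: the article does not state
# whether its limits are in GB (10^9 bytes) or GiB (2^30 bytes).
GB = 10 ** 9


def find_oversized_files(directory, limit_bytes):
    """Return (path, size_in_bytes) for every file above the limit, largest first."""
    oversized = []
    for path in Path(directory).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                oversized.append((str(path), size))
    oversized.sort(key=lambda item: item[1], reverse=True)
    return oversized


# Example calls (the directories are placeholders, not real default paths):
#   find_oversized_files("/path/to/partition/folder", 10 * GB)
#   find_oversized_files("/path/to/audits/index/folder", 3 * GB)
```

An empty result means that no file in the folder is above the limit that the corresponding table row describes.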
CPU Spikes
Check for CPU spikes by viewing the system health dashboard or by using system tools such as the Linux top command.
Issue | Relevant Database | Verification | Solution |
---|---|---|---|
Too many threads | BoltDB / Elasticsearch | Check threads in the | Restart the service. If the problem recurs, contact Cortex XSOAR customer support. |
Workers overloaded | BoltDB / Elasticsearch | Check | |
Docker containers overloaded | BoltDB / Elasticsearch | Check | Change limit for pool of running Docker images (default is 20) with the server configuration: |
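As a rough companion to the top command mentioned above, the following sketch compares the system load average with the number of CPU cores. It uses only the Python standard library. The threshold of 1.0 load per core is an assumption chosen for illustration, not Cortex XSOAR guidance, and on Linux the load average also counts processes waiting on I/O, so a high value is a hint and not proof of a CPU spike.

```python
import os


def load_per_core(load_average, cpu_count):
    """Normalize a load average by the number of CPU cores."""
    return load_average / max(cpu_count, 1)


def is_cpu_saturated(load_average, cpu_count, threshold=1.0):
    """Return True when the load per core exceeds the (assumed) threshold."""
    return load_per_core(load_average, cpu_count) > threshold


def current_load_report():
    """Return the current load figures, or None where they are unavailable."""
    if not hasattr(os, "getloadavg"):
        return None  # os.getloadavg() exists only on Unix-like systems
    try:
        one_min, five_min, fifteen_min = os.getloadavg()
    except OSError:
        return None
    cores = os.cpu_count() or 1
    return {
        "cores": cores,
        "load_1m": one_min,
        "load_5m": five_min,
        "load_15m": fifteen_min,
        "saturated": is_cpu_saturated(one_min, cores),
    }


print(current_load_report())
```

Running the report periodically and comparing the 1-minute and 15-minute values helps distinguish a short spike from sustained load.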
Slow Playbooks
Issue | Relevant Database | Verification | Solution |
---|---|---|---|
Indicator Extraction Enabled | BoltDB / Elasticsearch | N/A | Check Indicator Extraction settings and turn off Indicator Extraction where not needed. See Indicator Extraction. |
Enrichment integrations that fail or time out | BoltDB / Elasticsearch | Verify that you don’t have enrichment integrations that fail frequently or experience timeouts. This might occur with free enrichment services that quickly exceed quotas. | Depending on the integration, you might need to increase the quota or modify integration settings. |
Playbooks storing a large amount of data | BoltDB / Elasticsearch | Check if playbooks are storing more than 0.5 MB per incident. Confirm by running | View playbook metadata to understand which tasks are generating large amounts of data and then optimize tasks to reduce data storage. |
Complex playbooks with many tasks | BoltDB / Elasticsearch | Check for playbooks with a large number of tasks or sub-playbooks. | Confirm that the |
Other Possible Performance Issues
Issue | Relevant Database | Verification | Solution |
---|---|---|---|
Insufficient disk space | BoltDB | Check the | Archive old data. For more information, learn how to Free up Disk Space with Data Archiving. |
Indicators page | BoltDB / Elasticsearch | Check for noticeable lags on the Indicators page. | Exclude indicators that appear in every incident. Check Indicator Extraction settings and turn off Indicator Extraction where not needed. See Indicator Extraction. |
Overall slow UI | BoltDB / Elasticsearch | Check network latency and ping other Cortex XSOAR components, such as engines from the server. | Check with your IT admin regarding latency between client and server. |
Large exclusion list impacts system performance | BoltDB / Elasticsearch | Go to → → to check whether the list exceeds 1000 entries (yellow, Issue) or 5000 entries (red, At Risk). | |
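The exclusion list thresholds in the last row can be expressed as a small helper. This is only a sketch: the function name and the "OK" label for lists at or below 1000 entries are assumptions, and "exceeds" is read as strictly greater than the threshold.

```python
def exclusion_list_status(entry_count):
    """Map an exclusion list size to the status levels described in the table."""
    if entry_count > 5000:
        return "At Risk"  # red
    if entry_count > 1000:
        return "Issue"  # yellow
    return "OK"  # assumed label; the article does not name this state


for count in (250, 1001, 5001):
    print(count, exclusion_list_status(count))
```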
WebSockets
If the Cortex XSOAR server is responding slowly and does not receive data updates on certain pages and actions, the WebSocket might be disconnecting.
Confirm that this issue persists across different browsers (Chrome, Firefox, etc.).
Check WebSocket messages on the server.
Check the server.log file for messages to confirm that the WebSocket scenario is working, e.g., WebSocket req arrived and HTTP connection upgraded to WebSocket.
Check for WebSocket errors such as: Closing WebSocket ReadPump with err: websocket: close 1005 (no status).
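The server.log checks above can be sketched as a short script that counts the healthy WebSocket messages and collects the error lines. The marker strings are taken from the messages quoted above. The function name, the sample lines, and their timestamps are invented for illustration, and the real log format may differ.

```python
# Marker strings copied from the messages quoted in this article.
HEALTHY_MARKERS = (
    "WebSocket req arrived",
    "HTTP connection upgraded to WebSocket",
)
ERROR_MARKERS = ("Closing WebSocket ReadPump with err",)


def summarize_websocket_log(log_lines):
    """Count healthy WebSocket messages and collect WebSocket error lines."""
    summary = {"healthy": 0, "errors": []}
    for line in log_lines:
        if any(marker in line for marker in ERROR_MARKERS):
            summary["errors"].append(line.strip())
        elif any(marker in line for marker in HEALTHY_MARKERS):
            summary["healthy"] += 1
    return summary


# Invented sample lines; the real server.log layout is an assumption.
sample = [
    "2024-01-10 09:00:01 info WebSocket req arrived",
    "2024-01-10 09:00:01 info HTTP connection upgraded to WebSocket",
    "2024-01-10 09:05:12 error Closing WebSocket ReadPump with err: websocket: close 1005 (no status)",
    "2024-01-10 09:05:13 info unrelated message",
]
ws_summary = summarize_websocket_log(sample)
print(ws_summary["healthy"], len(ws_summary["errors"]))
```

To run this against a real log, read the lines from your server.log file instead of the sample list. Many error lines relative to healthy messages suggest that the WebSocket is disconnecting.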