Datasets and Presets

Cortex XDR XQL Language Reference

Product: Cortex XDR
Creation date: 2024-02-26
Last date published: 2024-04-16
Category: Reference Guide
Abstract

The Cortex Query Language supports built-in datasets, custom datasets, and presets.

Every Cortex Query Language (XQL) dataset query begins by identifying a data source that the query will run against. Each data source has a unique name, and a series of fields. Your query specifies the data source, and then provides stages that identify fields of interest and perform operations against those fields.

A dataset query can run against either a dataset or a preset. XQL supports dataset and field names in different languages. In addition, the supported dataset formats depend on the data retention offerings available in Cortex XDR, according to whether you query hot storage (the default) or cold storage. For more information, see XQL Language Structure.

Datasets

The standard, built-in data source that is available in every Cortex XDR instance is the xdr_data dataset. This is a very large dataset with many hundreds of available fields. See the Cortex XDR XQL Schema Reference for information about this dataset.

This dataset comprises both raw EDR events reported by the Cortex XDR agent and logs from other sources, such as third-party logs. To help you investigate events more efficiently, Cortex XDR also stitches these logs and events together into common schemas called stories. These stories are available through the Cortex XDR presets.

You use the dataset keyword to specify a dataset in your query.

You can create a custom dataset using the Target stage.
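As a rough illustration of the query shape described above, the following Python sketch assembles an XQL dataset query as a string: the dataset keyword names the source, and each later stage follows a pipe. The helper name dataset_query is hypothetical, and the field names and the fields and limit stages are illustrative assumptions that should be checked against the XQL stage and schema references.

```python
def dataset_query(dataset: str, fields: list[str], limit: int = 10) -> str:
    """Compose an XQL dataset query string (hypothetical helper).

    The data source comes first; every later stage is joined with a pipe.
    """
    # Dataset names are treated as lowercase in queries, so normalize here.
    stages = [f"dataset = {dataset.lower()}"]
    if fields:
        # Assumed stage: keep only the fields of interest.
        stages.append("fields " + ", ".join(fields))
    # Assumed stage: cap the number of returned rows.
    stages.append(f"limit {limit}")
    return "\n| ".join(stages)


# Field names below are illustrative; confirm them in the schema reference.
query = dataset_query("xdr_data", ["agent_hostname", "action_process_image_name"])
print(query)
```

The resulting string can be pasted into a query editor or sent through whatever client you use; the sketch only builds text and does not contact Cortex XDR.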

Depending on your integrations, you might also have the following datasets available for queries:

Data

Dataset

Active Directory via Cloud Identity Engine

pan_dss_raw

Note

The pan_dss_raw dataset is available only after you set up Cloud Identity Engine (previously called Directory Sync Service (DSS)). For more information, see Set Up Cloud Identity Engine.

Alerts table in Cortex XDR

alerts

Note

  • INFO alerts are not included in this dataset.

  • The alert fields included in this dataset are limited to certain fields available in the API. For the full list, see Get Alerts Multi-Events v2 API.

Amazon S3

  • Audit logs

    • All logs: aws_s3_raw

    • Normalize and enrich audit logs: cloud_audit_logs

  • Generic logs

    • <Vendor>_<Product>_raw

  • Network flow logs

    • All logs: aws_s3_raw

    • Normalize and enrich flow logs: xdr_data dataset with a preset called network_story

Authentication logs (subset of xdr_data)

Authentication logs, such as Okta: auth_logs

Note

The fields contained in this dataset are a subset of the fields in the xdr_data dataset.

AWS CloudTrail and Amazon CloudWatch

<Vendor>_<Product>_raw

Azure Event Hub

  • All logs: MSFT_Azure_raw

  • Normalize and enrich audit logs: cloud_audit_logs

Azure Network Watcher

  • All logs: MSFT_Azure_raw

  • Normalize and enrich flow logs: xdr_data dataset with a preset called network_story

BeyondTrust Privilege Management Cloud

beyondtrust_privilege_management_raw

Box

Events (admin_logs)

  • box_admin_logs_raw

Box Shield Alerts

  • box_shield_alerts_raw

Users

  • box_users_raw

Groups

  • box_groups_raw

Checkpoint FW1/VPN1

<Vendor>_<Product>_raw

Cisco ASA

Cisco ASA firewalls or Cisco AnyConnect VPN

  • cisco_asa_raw

Corelight Zeek

corelight_zeek_raw

Correlation rule executions

correlations_auditing

Cortex Data Lakes

xdr_data

Cortex XDR Collectors

panw_xdrc_raw

Cortex XDR Host Firewall enforcement events

host_firewall_events

CSV files in shared Windows directory

Custom datasets: Select from pre-existing user-created datasets or add a new dataset.

Database data (MySQL, PostgreSQL, MSSQL, and Oracle)

<Vendor>_<Product>_raw

Data ingestion health metrics

Datasets:

  • collection_auditing

  • data_ingestion_health

  • metrics_source

Dropbox

Events

  • dropbox_events_raw

Member Devices

  • dropbox_members_devices_raw

Users

  • dropbox_users_raw

Groups

  • dropbox_groups_raw

Elasticsearch Filebeat

<Vendor>_<Product>_raw

Elasticsearch Winlogbeat

<Vendor>_<Product>_raw

Note

If the vendor and product are not specified in the Winlogbeat profile’s configuration file, Cortex XDR creates a default dataset, microsoft_windows_raw.

Errors related to Parsing Rules

parsing_rules_errors

Forcepoint DLP

forcepoint_dlp_endpoint_raw

Fortinet Fortigate

<Vendor>_<Product>_raw

GlobalProtect access authentication logs

xdr_data

Note

To ensure GlobalProtect access authentication logs are sent to Cortex XDR, verify that your PANW firewall’s Log Settings for GlobalProtect has the Cortex Data Lake checkbox selected.

Google Cloud Platform (GCP) logs

  • All log types: google_cloud_logging_raw

  • Normalize and enrich audit and flow logs:

    • Audit logs: cloud_audit_logs

    • Network flow logs: xdr_data dataset with a preset called network_story

Google Kubernetes Engine (GKE)

<Vendor>_<Product>_raw

Google Workspace

  • Google Chrome: google_workspace_chrome_raw

  • Admin Console: google_workspace_admin_console_raw

  • Google Chat: google_workspace_chat_raw

  • Enterprise Groups: google_workspace_enterprise_groups_raw

  • Login: google_workspace_login_raw

  • Rules: google_workspace_rules_raw

  • Google Drive: google_workspace_drive_raw

  • Token: google_workspace_token_raw

  • User Accounts: google_workspace_user_accounts_raw

  • SAML: google_workspace_saml_raw

  • Alerts: google_workspace_alerts_raw

  • Emails: google_gmail_raw

Host Inventory and Vulnerability Assessment

  • Datasets

    • host_inventory

    • va_cves

    • va_endpoints

  • Presets

    • host_inventory

    • host_inventory_accessibility

    • host_inventory_applications

    • host_inventory_auto_runs

    • host_inventory_cpus

    • host_inventory_daemons

    • host_inventory_disks

    • host_inventory_drivers

    • host_inventory_endpoints

    • host_inventory_extensions

    • host_inventory_groups

    • host_inventory_kbs

    • host_inventory_mounts

    • host_inventory_services

    • host_inventory_shares

    • host_inventory_users

    • host_inventory_volumes

    • host_inventory_vss

Incidents table in Cortex XDR

incidents

JSON or text logs from third-party source over HTTP

<Vendor>_<Product>_raw

Login logs (subset of xdr_data)

Login logs, such as WEC: login_logs

Note

The fields contained in this dataset are a subset of the fields in the xdr_data dataset.

Logs from third-party source over FTP, FTPS, or SFTP

<Vendor>_<Product>_raw

Microsoft Office 365

  • Microsoft Office 365 audit events from Management Activity API

    • Azure AD Activity Logs: msft_o365_azure_ad_raw

    • Exchange Online: msft_o365_exchange_online_raw

    • SharePoint Online: msft_o365_sharepoint_online_raw

    • DLP: msft_o365_dlp_raw

    • General: msft_o365_general_raw

  • Microsoft Office 365 emails via Microsoft’s Graph API: msft_o365_emails_raw

  • Azure AD authentication events from Microsoft Graph API: msft_azure_ad_raw

  • Azure AD audit events from Microsoft Graph API: msft_azure_ad_audit_raw

  • Alerts from Microsoft Graph Security API: msft_graph_security_alerts_raw

NetFlow

  • ip_flow_ip_flow_raw (default)

  • When configured, uses the format <Vendor>_<Product>_raw

Network Share logs

<Vendor>_<Product>_raw

Okta

okta_sso_raw

OneLogin

Log collection

  • onelogin_events_raw

Directory

  • onelogin_users_raw

  • onelogin_groups_raw

  • onelogin_apps_raw

PANW EDR

xdr_data

PANW IoT Security

Alerts

  • panw_iot_security_alerts_raw

Devices

  • panw_iot_security_devices_raw

PANW NGFW

panw_ngfw_*_raw

Supports the following logs.

*These datasets use the query field names as described in the Cortex schema documentation.

PingFederate

ping_identity_pingfederate_raw

PingOne for Enterprise

pingone_sso_raw

Prisma Cloud

prisma_cloud_raw

Prisma Cloud Compute

prisma_cloud_compute_raw

Proofpoint Targeted Attack Protection

proofpoint_tap_raw

ServiceNow CMDB

A ServiceNow CMDB dataset is created for each table configured for data collection using the format servicenow_cmdb_<table name>_raw.

Salesforce.com

  • salesforce_connectedapplication_raw

  • salesforce_permissionset_raw

  • salesforce_profile_raw

  • salesforce_groupmember_raw

  • salesforce_group_raw

  • salesforce_user_raw

  • salesforce_userrole_raw

  • salesforce_document_raw

  • salesforce_contentfolder_raw

  • salesforce_attachment_raw

  • salesforce_contentdistribution_raw

  • salesforce_tenantsecuritylogin_raw

  • salesforce_useraccountteammember_raw

  • salesforce_tenantsecurityuserperm_raw

  • salesforce_account_raw

  • salesforce_audit_raw

  • salesforce_login_raw

  • salesforce_eventlogfile_raw

Syslog/CEF

<CEFVendor>_<CEFProduct>_raw

USB devices connect and disconnect events reported by the agent

xdr_data

Note

  • You can use XQL Search to query this data and build widgets based on either the xdr_data dataset or the device_control preset.

  • To view these events in XQL Search, the Device Configuration of the endpoint profile must be set to Block. Otherwise, the USB events are not captured. The events are also captured when a group of device types is blocked on the endpoints with a permanent or temporary exception in place. For more information, see Ingest Connect and Disconnect Events of USB Devices in the Device Control documentation.

VPN logs (subset of xdr_data)

VPN logs, such as GlobalProtect: vpn_logs

Note

The fields contained in this dataset are a subset of the fields in the xdr_data dataset.

Windows Endpoints using Cortex XDR Forensics Add-on

  • forensics_amcache

  • forensics_application_resource_usage

  • forensics_arp_cache

  • forensics_background_activity_monitor

  • forensics_chrome_history

  • forensics_cid_size_mru

  • forensics_command_history

  • forensics_dns_cache

  • forensics_edge_anaheim_history

  • forensics_edge_spartan_history

  • forensics_event_log

  • forensics_file_access

  • forensics_file_listing

  • forensics_firefox_history

  • forensics_handles

  • forensics_hosts_file

  • forensics_internet_explorer_history

  • forensics_jumplist

  • forensics_last_visited_pidl_mru

  • forensics_log_me_in

  • forensics_net_sessions

  • forensics_network

  • forensics_network_connectivity_usage

  • forensics_network_data_usage

  • forensics_open_save_pidl_mru

  • forensics_port_listing

  • forensics_prefetch

  • forensics_process_execution

  • forensics_process_listing

  • forensics_psreadline

  • forensics_recent_files

  • forensics_recentfilecache

  • forensics_recycle_bin

  • forensics_registry

  • forensics_remote_access

  • forensics_seven_zip_folder_history

  • forensics_shellbags

  • forensics_shimcache

  • forensics_team_viewer

  • forensics_typed_paths

  • forensics_typed_urls

  • forensics_user_access_logging

  • forensics_user_assist

  • forensics_windows_activities

  • forensics_winrar_arc_history

  • forensics_word_wheel_query

Windows event logs via Cortex XDR Windows agents

microsoft_windows_raw

Windows Event Collector (WEC)

xdr_data

microsoft_windows_raw

Windows DHCP using Elasticsearch Filebeat

microsoft_dhcp_raw

Windows DNS Debug using Elasticsearch Filebeat

Raw Data

  • microsoft_dns_raw

Normalized Stories

  • xdr_data with the preset called network_story.

Workday

workday_workday_raw

Zscaler Cloud Firewall

ZIA

  • Firewall logs: zscaler_nssfwlog_raw

  • Web logs: zscaler_nssweblog_raw

ZPA

  • zscaler_zpa_raw

Note

Dataset names can contain uppercase characters, but queries always treat dataset names as lowercase. Dataset names can also contain characters from different languages, numbers (0-9), and underscores (_). However, an underscore cannot be the first character of the name.

Upon ingestion, all fields are retained, even fields with a null value. You can also use the Cortex Query Language to query parsing rules for null values.
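The naming rules in this note can be sketched as a small Python check. This is a minimal sketch under stated assumptions, not the validation that Cortex XDR itself performs: the function names are hypothetical, treating every other character as invalid is an assumption, and the way Cortex XDR cleans up vendor and product values when it forms a <Vendor>_<Product>_raw name is not covered here.

```python
def normalize_dataset_name(name: str) -> str:
    """Return the name as a query sees it: always lowercase."""
    return name.lower()


def is_valid_dataset_name(name: str) -> bool:
    """Check a name against the rules stated in the note (assumed exhaustive)."""
    # An underscore cannot be the first character; empty names are rejected too.
    if not name or name.startswith("_"):
        return False
    # Letters from any language, digits 0-9, and underscores are accepted.
    return all(ch == "_" or ch in "0123456789" or ch.isalpha() for ch in name)


def raw_dataset_name(vendor: str, product: str) -> str:
    """Build a <Vendor>_<Product>_raw style name (hypothetical helper)."""
    return normalize_dataset_name(f"{vendor}_{product}_raw")


print(normalize_dataset_name("MSFT_Azure_raw"))
print(is_valid_dataset_name("_private"))
print(raw_dataset_name("Cisco", "ASA"))
```

Lowercasing before comparison reflects the statement that queries treat dataset names as lowercase, so MSFT_Azure_raw and msft_azure_raw refer to the same dataset in a query.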

Presets

Presets offer groupings of xdr_data fields that are useful for analyzing specific areas of network and endpoint activity. All of the fields available for a preset are also available in the larger xdr_data dataset, but a query that uses the preset can run more efficiently. Presets are sorted at random by the first 1 million results found.

Two of the available presets are stories. These contain information stitched together from Cortex XDR agent events and log files to form a common schema. They are authentication_story and network_story.

You use the preset keyword to specify a preset in your query.
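To contrast the two source keywords, the following Python sketch builds the opening of a query from either a preset or a dataset. The helper names source_clause and build_query are hypothetical, and the limit stage is an illustrative assumption. Only the dataset and preset keywords and the network_story and xdr_data names come from this page.

```python
SOURCE_KEYWORDS = ("dataset", "preset")


def source_clause(kind: str, name: str) -> str:
    """Return the first line of a query, for example 'preset = network_story'."""
    if kind not in SOURCE_KEYWORDS:
        raise ValueError(f"unsupported source keyword: {kind!r}")
    return f"{kind} = {name}"


def build_query(kind: str, name: str, limit: int = 100) -> str:
    """Join the source clause and an assumed limit stage with a pipe."""
    return "\n| ".join([source_clause(kind, name), f"limit {limit}"])


# The story preset exposes a subset of xdr_data fields.
story_query = build_query("preset", "network_story")
# The same investigation against the full dataset.
full_query = build_query("dataset", "xdr_data")

print(story_query)
print(full_query)
```

Only the first line of the two queries differs, which is the point of the sketch: switching between the preset and the full dataset changes the source keyword and name, while the later stages can stay the same as long as they reference fields that the preset includes.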