The Cortex Query Language supports built-in datasets, custom datasets, and presets.
Every Cortex Query Language (XQL) dataset query begins by identifying a data source that the query will run against. Each data source has a unique name and a set of fields. Your query specifies the data source and then provides stages that identify fields of interest and perform operations on those fields.
A dataset query can run against either a dataset or a preset. XQL supports dataset and field names in different languages. In addition, the supported dataset formats depend on the data retention offerings available in Cortex XDR and on whether you query hot storage (the default) or cold storage. For more information, see XQL Language Structure.
Datasets
The standard, built-in data source that is available in every Cortex XDR instance is the xdr_data dataset. This is a very large dataset with many hundreds of available fields. See the Cortex XDR XQL Schema Reference for information about this dataset.
This dataset comprises both raw EDR events reported by the Cortex XDR agent and logs from other sources, such as third-party logs. To help you investigate events more efficiently, Cortex XDR also stitches these logs and events together into common schemas called stories. These stories are available through the Cortex XDR presets.
Use the dataset keyword to specify a dataset in your query.
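As a rough illustration, a dataset query names its source with the dataset keyword and then pipes the results through stages. The Python sketch below only assembles the query text. The helper name build_dataset_query is hypothetical, and the two field names are assumed examples of xdr_data fields, so check them against the schema reference before relying on them.

```python
def build_dataset_query(dataset, fields, limit=10):
    """Assemble XQL text: a dataset source followed by fields and limit stages."""
    # Hypothetical helper for illustration; it only builds a string and
    # does not contact Cortex XDR.
    stages = [f"dataset = {dataset}"]
    if fields:
        stages.append("fields " + ", ".join(fields))
    stages.append(f"limit {limit}")
    # Each stage after the source is introduced by a pipe character.
    return "\n| ".join(stages)


# Field names here are assumed examples, not a verified schema listing.
query = build_dataset_query("xdr_data", ["agent_hostname", "action_process_image_name"])
print(query)
```

The printed text starts with the data source (dataset = xdr_data) and is followed by one stage per line, which mirrors the source-then-stages structure described above.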
You can create a custom dataset using the Target stage.
Depending on your integrations, you might also have the following datasets available for queries:
Data | Dataset |
---|---|
Active Directory via Cloud Identity Engine | pan_dss_raw. Note: This dataset exists only after you set up Cloud Identity Engine (previously called Directory Sync Service (DSS)). For more information, see Set Up Cloud Identity Engine. |
Alerts table in Cortex XDR | alerts |
Amazon S3 | |
Authentication logs (subset of xdr_data) | Authentication logs, such as Okta: auth_logs. Note: The fields contained in this dataset are a subset of the fields in the xdr_data dataset. |
AWS CloudTrail and Amazon CloudWatch | <Vendor>_<Product>_raw |
Azure Event Hub | |
Azure Network Watcher | |
BeyondTrust Privilege Management Cloud | beyondtrust_privilege_management_raw |
Box | Events (admin_logs); Box Shield Alerts; Users; Groups |
Checkpoint FW1/VPN1 | <Vendor>_<Product>_raw |
Cisco ASA | Cisco ASA firewalls or Cisco AnyConnect VPN |
Collector status change audit for collection integrations, custom collectors, and marketplace collectors | collection_auditing |
Corelight Zeek | corelight_zeek_raw |
Correlation rule executions | correlations_auditing |
Cortex Data Lakes | xdr_data |
Cortex XDR Collectors | panw_xdrc_raw |
Cortex XDR Host Firewall enforcement events | host_firewall_events |
CSV files in shared Windows directory | Custom datasets: Select from pre-existing user-created datasets or add a new dataset. |
Database data (MySQL, PostgreSQL, MSSQL, and Oracle) | <Vendor>_<Product>_raw |
Data ingestion health metrics | Datasets: |
Dropbox | Events; Member Devices; Users; Groups |
Elasticsearch Filebeat | <Vendor>_<Product>_raw |
Elasticsearch Winlogbeat | <Vendor>_<Product>_raw. Note: If the vendor and product are not specified in the Winlogbeat profile’s configuration file, Cortex XDR creates a default dataset, microsoft_windows_raw. |
Errors related to Parsing Rules | parsing_rules_errors |
Forcepoint DLP | forcepoint_dlp_endpoint_raw |
Fortinet Fortigate | <Vendor>_<Product>_raw |
GlobalProtect access authentication logs | xdr_data. Note: To ensure GlobalProtect access authentication logs are sent to Cortex XDR, verify that your PANW firewall’s Log Settings for GlobalProtect has the Cortex Data Lake checkbox selected. |
Google Cloud Platform (GCP) logs | |
Google Kubernetes Engine (GKE) | <Vendor>_<Product>_raw |
Google Workspace | |
Host Inventory and Vulnerability Assessment | |
Incidents table in Cortex XDR | incidents |
JSON or text logs from third-party source over HTTP | <Vendor>_<Product>_raw |
Login logs (subset of xdr_data) | Login logs, such as WEC: login_logs. Note: The fields contained in this dataset are a subset of the fields in the xdr_data dataset. |
Logs from third-party source over FTP, FTPS, or SFTP | <Vendor>_<Product>_raw |
Microsoft 365 (email) | |
Microsoft Office 365 | |
NetFlow | |
Network Share logs | <Vendor>_<Product>_raw |
Okta | okta_sso_raw |
OneLogin | Log collection; Directory |
PANW EDR | xdr_data |
PANW IOT Security | Alerts; Devices |
PANW NGFW | panw_ngfw_*_raw. Supports the following logs. *These datasets use the query field names as described in the Cortex schema documentation. |
PingFederate | ping_identity_pingfederate_raw |
PingOne for Enterprise | pingone_sso_raw |
Prisma Cloud | prisma_cloud_raw |
Prisma Cloud Compute | prisma_cloud_compute_raw |
Proofpoint Targeted Attack Protection | proofpoint_tap_raw |
ServiceNow CMDB | A ServiceNow CMDB dataset is created for each table configured for data collection using the format |
Salesforce.com | |
Syslog/CEF | <CEFVendor>_<CEFProduct>_raw |
USB device connect and disconnect events reported by the agent | xdr_data |
VPN logs (subset of xdr_data) | VPN logs, such as GlobalProtect: vpn_logs. Note: The fields contained in this dataset are a subset of the fields in the xdr_data dataset. |
Windows Endpoints using Cortex XDR Forensics Add-on | |
Windows event logs via Cortex XDR Windows agents | microsoft_windows_raw |
Windows Event Collector (WEC) | xdr_data, microsoft_windows_raw |
Windows DHCP using Elasticsearch Filebeat | microsoft_dhcp_raw |
Windows DNS Debug using Elasticsearch Filebeat | Raw Data; Normalized Stories |
Workday | workday_workday_raw |
Zscaler Cloud Firewall | ZIA; ZPA |
Note
Dataset names can use uppercase characters, but in queries dataset names are always treated as if they are lowercase. In addition, dataset names can contain characters from different languages, numbers (0-9), and underscores (_). However, an underscore cannot be the first character of the name.
Upon ingestion, all fields are retained, even fields with a null value. You can also use the Cortex Query Language to query parsing rules for null values.
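The naming rules in the note above can be expressed as a small check. This Python sketch is one reading of those rules, not the validation that Cortex XDR performs. The function name normalize_dataset_name is hypothetical, and treating every Unicode letter as allowed is an assumption based on the statement that names are supported in different languages.

```python
def normalize_dataset_name(name):
    """Return the lowercase form used in queries, or raise ValueError."""
    if not name:
        raise ValueError("dataset name is empty")
    if name[0] == "_":
        raise ValueError("dataset name cannot start with an underscore")
    for ch in name:
        # Assumed allowed set: letters from any language, the digits 0-9,
        # and the underscore.
        if not (ch.isalpha() or ch in "0123456789" or ch == "_"):
            raise ValueError(f"unsupported character: {ch!r}")
    # Queries treat dataset names as lowercase regardless of how they were typed.
    return name.lower()


print(normalize_dataset_name("My_Custom_Dataset"))
```

A name such as _hidden or bad-name raises ValueError in this sketch, while My_Custom_Dataset is accepted and folded to lowercase.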
Presets
Presets offer groupings of xdr_data fields that are useful for analyzing specific areas of network and endpoint activity. All of the fields available for a preset are also available on the larger xdr_data dataset, but by using the preset your query can run more efficiently. Presets are sorted at random by the first 1 million results found.
Two of the available presets are stories. These contain information stitched together from Cortex XDR agent events and log files to form a common schema. They are authentication_story and network_story.
Use the preset keyword to specify a preset in your query.
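A preset query has the same shape as a dataset query, with the preset keyword in place of dataset. The Python sketch below only assembles the query text for the two story presets named above. The helper build_preset_query is hypothetical, and the limit stage is added purely to keep the illustrative result set small.

```python
def build_preset_query(preset, limit=10):
    """Assemble XQL text that reads from a preset and caps the result count."""
    # Hypothetical helper for illustration; it only builds a string and
    # does not contact Cortex XDR.
    return f"preset = {preset}\n| limit {limit}"


for story in ("authentication_story", "network_story"):
    print(build_preset_query(story))
    print()
```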