Extract URL indicators from text. A regular expression recognizes each URL, and a formatting script then formats it.
The Cortex XSIAM URL indicator type is built using a regular expression and a formatting script. The following describes the URL extraction components and the output you should expect when extracting URL indicators.
URL extraction components
URL indicator extraction has two components:
Regular expression
Formatting script
Regular expression
The URL regular expression tries to match valid URLs in a given text based on the following characteristics:
A URL prefixed by one of the following protocols:
HTTP
HTTPS
FTP
FTPS
HXXP (defanged HTTP)
HXXPS (defanged HTTPS)
A URL with ASCII or non-ASCII characters
Escaped and unescaped URLs
A URL with or without query parameters
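The characteristics above can be illustrated with a deliberately simplified pattern. This is a sketch only, not the regular expression that Cortex XSIAM ships: the names URL_PATTERN and find_urls are hypothetical, the pattern only handles URLs that carry one of the listed protocol prefixes, and the trailing-punctuation trimming is an assumption added for readability.

```python
import re

# Illustrative only: a simplified pattern, NOT the product's actual regular expression.
URL_PATTERN = re.compile(
    r"\b(?:https?|ftps?|hxxps?)://"  # listed protocols, including defanged hxxp/hxxps
    r"[^\s<>\"']+",                  # host, path, optional query; non-ASCII and %-escapes allowed
    re.IGNORECASE,
)


def find_urls(text: str) -> list[str]:
    """Return every protocol-prefixed URL candidate found in `text`."""
    # Assumption: punctuation that ends a sentence is not part of the URL.
    return [m.group(0).rstrip(".,;:!?)") for m in URL_PATTERN.finditer(text)]


sample = "Visit hxxps://www[.]example.com/a?b=1, or http://öevil.tld/ and ftp://foo.bar/resource."
print(find_urls(sample))
```

Note that the match keeps defanged forms such as "[.]" and hxxps untouched. In this sketch, as in the description above, cleaning them up is left to the formatting stage.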
Formatting script
After the regular expression extracts the URLs, the FormatURL formatting script iterates over each given URL and does the following:
1. If the URL is prefixed by a URL defense system (Proofpoint or ATP), the script extracts the redirected URL and continues with steps 3-6 for both the original URL and the extracted redirected URL.
2. If the URL is NOT prefixed by a URL defense system (Proofpoint or ATP), the script checks whether the first query parameter is a redirected URL query parameter by checking whether the first parameter value starts with HTTP or HTTPS.
For example:
https://www.good.site/index.html?redirectURL=https://evil.com/mal.html
If such a query parameter exists, the script extracts the redirected URL and performs steps 3-6 for both the given URL and the URL extracted from the query parameter.
3. Replaces "[.]" with ".".
For example:
https://www[.]example.com becomes https://www.example.com
4. Decodes the URL.
For example:
https://www.example.com%2F%21%40 becomes https://www.example.com/!@
5. Converts obfuscated characters.
For example:
hxxp → http
hxxps → https
6. Returns the formatted URL.
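The formatting flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FormatURL script itself: the function names refang, redirect_target, and format_url are hypothetical, the redirect check assumes the value must start with "http://" or "https://", and unwrapping of Proofpoint or ATP defense URLs is omitted because their encodings are vendor-specific.

```python
from typing import Optional
from urllib.parse import parse_qsl, unquote


def refang(url: str) -> str:
    """Replace "[.]" with ".", percent-decode, and convert hxxp/hxxps to http/https."""
    url = url.replace("[.]", ".")
    url = unquote(url)
    scheme, sep, rest = url.partition("://")
    if sep and scheme.lower() in ("hxxp", "hxxps"):
        url = scheme.lower().replace("xx", "tt") + sep + rest
    return url


def redirect_target(url: str) -> Optional[str]:
    """Return the first query parameter's value if it looks like a redirected URL."""
    _, _, query = url.partition("?")
    params = parse_qsl(query)  # parse_qsl also percent-decodes the values
    # Assumption: "starts with HTTP or HTTPS" is read as an http:// or https:// prefix.
    if params and params[0][1].lower().startswith(("http://", "https://")):
        return params[0][1]
    return None


def format_url(url: str) -> list[str]:
    """Format the given URL and, if present, the URL found in its first query parameter."""
    results = [refang(url)]
    target = redirect_target(url)
    if target is not None:
        results.append(refang(target))
    return results


print(format_url("https://www.good.site/index.html?redirectURL=https://evil.com/mal.html"))
print(format_url("hxxps://www[.]example.com%2F%21%40"))
```

The sketch returns a list because a URL that carries a redirect parameter yields two results: the given URL and the extracted one, each passed through the same cleanup.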
Common URL structures
The following are the most common supported URL structures:
http://öevil.tld/
https://evilö.tld/evil.html
www.evilö.tld/evil.aspx
https://www.evöl.tld/
www.evil.tld/resource
http://xn--e1v2i3l4.tld/evilagain.aspx
https://www.xn--e1v2i3l4.tld
hxxps://www.xn--e1v2i3l4.tld
hxxp://www.xn--e1v2i3l4.tld
www.evil.tld:443/path/to/resource.html
https://1.2.3.4/path/to/resource.html
1.2.3.4/path
1.2.3.4/path/to/resource.html
http://1.2.3.4:8080/
http://1.2.3.4:8080/resource.html
http://☺.evil.tld/
http://1.2.3.4
ftp://foo.bar/resource
ftps://foo.bar/resource
For more information, see Indicator Extraction.