
Extracting a field at search time - rex question

jstockamp
Communicator

I've looked at the Splunk documentation but can't make sense of it; maybe it's too early in the morning. I'm having a problem extracting a field at search time.

I'm going through some web logs and I've got a field called referer. It has values in it like

http://www.mysite12.com
http://www.1234.org
http://wkjew23.ajkda.com/abc?1234
http://1254.splunk.com/Test

What I'd like to do is create a field that is just the domain name (i.e. just mysite12.com, 1234.org, etc.). I believe the correct regex to use is "\w*(.com|.net|.org)"

How do I extract this field in my search? I've used

rex field=referer "(?<refer_domain>)\w*(.com|.net|.org)"

But that doesn't seem to work. I'm unclear where/how I specify the field name for the extraction.

1 Solution

hazekamp
Builder

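jstockamp, in general the ?<field> name goes inside the capture group, right after the opening parenthesis, so that the group wraps the text you want to keep. The regular expression below might work a bit better for you.

Updated with proper escaping of backslashes: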

rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"

For testing:

index=_internal | stats count | eval count="http://www.mysite12.com/" | rename count as referer | rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"
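
To see why the placement of the group name matters, here is a minimal sketch outside Splunk using Python's re module. It is only an approximation of what rex does: Python writes named groups as (?P<name>...) where rex accepts (?<name>...), and the helper name extract_host is made up for illustration. The second pattern is the one above as the regex engine sees it, after the quoted search string has been unescaped.

```python
import re

# Python's re spells named groups (?P<name>...); rex uses PCRE's (?<name>...).

# Original attempt: the named group is empty, so it can only capture "".
broken = re.compile(r"(?P<refer_domain>)\w*(.com|.net|.org)")

# Pattern from the answer as the regex engine receives it
# (a single escaped backslash in each place).
fixed = re.compile(
    r"(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\))"
    r"(?P<referer_domain>.*?)([/\\]|$)"
)


def extract_host(referer: str):
    """Return the referer_domain capture, or None if the pattern does not match."""
    match = fixed.search(referer)
    return match.group("referer_domain") if match else None


samples = [
    "http://www.mysite12.com",
    "http://www.1234.org",
    "http://wkjew23.ajkda.com/abc?1234",
    "http://1254.splunk.com/Test",
]
for url in samples:
    print(url, "->", extract_host(url))
```

Note that the capture is the full host name (for example www.mysite12.com), so any leading www. or other subdomain is kept rather than trimmed to mysite12.com.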


jstockamp
Communicator

Hmmm, that errors out. Here's my complete search command:

eventtype = "evt_all" | eval refer_domain = (coalesce(sc_Referer_, referer_domain)) | rex field=refer_domain "(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\\\))(?<refer>.*?)(?:[/\\]|$)" | table refer_domain, refer

and the error is

Error in 'rex' command: Encountered the following error while compiling the regex '(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\))(?<refer>.*?)(?:[/\]|$)': Regex: missing terminating ] for character class

hazekamp
Builder

It's a simple issue with escaping backslashes. See the updated rex above, along with a search to test it.
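
The error message shows what happened: the search contained [/\\], but the regex that reached the compiler was [/\], where \] is an escaped bracket and the character class never closes. A rough sketch of that two-layer unescaping in Python follows. The spl_unescape function is a hypothetical stand-in for the quoted-string handling, inferred only from the error text above, and Python's re is used in place of the PCRE engine.

```python
import re


def spl_unescape(typed: str) -> str:
    # Assumption: inside a quoted search string, each pair of backslashes
    # collapses to one before the regex is compiled.
    return typed.replace("\\\\", "\\")


def compiles(pattern: str) -> bool:
    """Report whether the regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


typed_original = r"(?:[/\\]|$)"   # two backslashes typed in the search
typed_updated = r"([/\\\\]|$)"    # four backslashes typed in the search

for typed in (typed_original, typed_updated):
    seen_by_regex = spl_unescape(typed)
    print(typed, "->", seen_by_regex, "compiles:", compiles(seen_by_regex))
```

With four backslashes in the search string, the regex engine receives \\ inside the class, which is one literal backslash, and the bracket closes as intended.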


jstockamp
Communicator

Thanks, after the edit this works great.
