Splunk Search

Extracting a field at search time - rex question

jstockamp
Communicator

I've looked at the splunk documentation but can't make sense of it, maybe it's too early int he morning. I'm having a problem extracting a field at search time.

I'm going through some web logs and I've got a field called referer. It's got values in in it like

http://www.mysite12.com
http://www.1234.org
http://wkjew23.ajkda.com/abc?1234
http://1254.splunk.com/Test

What I'd like to do is create a field that is just the domain name (i.e. just mysite12.com, 1234.org, etc.). I believe the correct regex to use is "\w*(.com|.net|.org)"

How do I extract this field in my search. I've used

rex field=referer "(?<refer_domain>)\w*(.com|.net|.org)"

But that doesn't seem to work. I'm unclear where/how I specify the field name for the extraction.

Tags (2)
1 Solution

hazekamp
Builder

jstockamp, In general your ?<field> goes inside a capture group. The regular expression below might be a bit better for you.

Updated: Proper escaping of slashes:

rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"

For testing:

index=_internal | stats count | eval count="http://www.mysite12.com/" | rename count as referer | rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"

View solution in original post

jstockamp
Communicator

Hmmm, that errors out. Here's my complete search command:

eventtype = "evt_all" | eval refer_domain = (coalesce(sc_Referer_, referer_domain)) | rex field=refer_domain "(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\\\))(?<refer>.*?)(?:[/\\]|$)" | table refer_domain, refer

and the error is

Error in 'rex' command: Encountered the following error while compiling the regex '(?:https?|ftp|gopher|telnet|file|notes|ms-help):(?:(?://)|(?:\\))(?<refer>.*?)(?:[/\]|$)': Regex: missing terminating ] for character class
0 Karma

hazekamp
Builder

Simple issue w/ escaping slashes. See updated rex above; also w/ search to test

0 Karma

hazekamp
Builder

jstockamp, In general your ?<field> goes inside a capture group. The regular expression below might be a bit better for you.

Updated: Proper escaping of slashes:

rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"

For testing:

index=_internal | stats count | eval count="http://www.mysite12.com/" | rename count as referer | rex field=referer "(https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))(?<referer_domain>.*?)([/\\\\]|$)"

jstockamp
Communicator

Thanks, after the edit this works great.

0 Karma
Get Updates on the Splunk Community!

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...