Splunk Search

Field Extraction (Regex) When Column Is Sometimes Absent

RMartinezDTV
Path Finder

Hi, I'm working on a Regex for field extractions of an alert log. The log has 1 line per alert in the following format:

[11/26/2013 9:13:41 AM]     Server1 LogTest: /var/log   Ok      Text Log test
[11/26/2013 9:13:36 AM]     Server1 LogTest: /var/log   Bad <......data.......> Text Log test

The difficulty comes when handling some OK statuses; you'll notice here that a 'Bad' status returns data (the relevant log lines), but an 'Ok' status returns a blank (actually 2 tabs) data section.

It seems like every regex I come up with will accidentally capture some part of Text Log test and use that as part of all of the data section when data isn't present.

Can I get some pointers on the proper regex expression? My current regex is below, and I think I've exhausted the guess and check method. 🙂

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?)\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)
Tags (2)
0 Karma
1 Solution

RMartinezDTV
Path Finder

Probably tacky to accept my own answer, but here's the final result for reference:

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t

This correctly matches event when a field has blank data. Adjust punctuation (\t,\s,:, and ]) as needed for your data.

View solution in original post

0 Karma

RMartinezDTV
Path Finder

Ayn, I actually read your notes here: http://answers.splunk.com/answers/67170/index-time-field-extraction about using search-time extractions....and I just learned what the difference is from the docs!

0 Karma

RMartinezDTV
Path Finder

Probably tacky to accept my own answer, but here's the final result for reference:

]\t+\s+(?P<server>.+?)\s+(?P<category>.+?):\s(?P<object>.+?)\t(?P<status>.+?)\t(?P<data>.*)\t(?P<test>.*)\t

This correctly matches event when a field has blank data. Adjust punctuation (\t,\s,:, and ]) as needed for your data.

0 Karma

kallu
Communicator

Would it work better if you change the end

(?P<status>.+?)\t(?P<data>.+?)\t(?P<test>.+?)

to

(?P<status>.+)\t(?P<data>.*)\t(?P<test>.+)

then you should match to an empty string if there is just 2 tabs in case of "Ok"? It sounds too easy and I didn't test it with Splunk, so maybe I'm missing something?

RMartinezDTV
Path Finder

Thanks! This was almost perfect. See my answer below.

0 Karma

lukejadamec
Super Champion

My mistake, a search time field extraction.

0 Karma

Ayn
Legend

DELIMS doesn't work as an index-time extraction, and index-time extractions should be avoided unless you really know what you're doing and why.

lukejadamec
Super Champion

Have you tried setting this up for search time extraction using the log delimiter and a preset series of fields?

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...