Splunk Search

How can I extract breakable_text properly?

ew09
New Member

Hi everyone,

I have many logs in the following format, for example:


Timestamp: 6/27/2016 8:40:25 PM
Message: Matcher: Record not matched for example record
Category: Business
Priority: 10
EventId: 1
Severity: Information
Title:Information
Machine: MachineName
App Domain: Test.exe
ProcessId: 5400
Process Name: E:\Test\Test.exe
Thread Name:
Win32 ThreadId:2600
Extended Properties:

I want to be able to grab all the information after the field name. For example, I want a field called Machine whose value is 'MachineName'. By using \n as a delimiter I can split the lines up, but I do not want the field name itself in the value. Any advice on what regex to create to do this? Thanks!

0 Karma
1 Solution

javiergn
Super Champion

Another approach using a little bit of regex to get exactly what you need.
You can obviously ignore the first lines that I used to replicate your use case:

| stats count | fields - count
| eval _raw = "
Timestamp: 6/27/2016 8:40:25 PM
Message: Matcher: Record not matched for example record
Category: Business
Priority: 10
EventId: 1
Severity: Information
Title:Information
Machine: MachineName
App Domain: Test.exe
ProcessId: 5400
Process Name: E:\Test\Test.exe
Thread Name: 
Win32 ThreadId:2600
Extended Properties:
"
| rex field=_raw max_match=0 "(?msi)^(?<keyvalue>[^:]+:\s?([^\n]+)?)$"
| mvexpand keyvalue
| rex field=keyvalue "(?i)^(?<key>[^:]+):\s?((?<value>[^\n]+)$)?"
| fillnull value="NULL" value
| fields - keyvalue
| eval {key}=value
| fields - key, value
| stats first(*) as * by _raw

Output: see picture
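
If you want to sanity-check the "split on the first colon only" idea outside Splunk, here is a minimal Python sketch of the same logic. It is not the SPL above: the pattern is a slightly stricter variation (the key may not span lines), and the names RAW, LINE and parse_event are made up for this illustration.

```python
import re

# Sample event copied from the question (raw string so backslashes stay literal).
RAW = r"""
Timestamp: 6/27/2016 8:40:25 PM
Message: Matcher: Record not matched for example record
Category: Business
Priority: 10
EventId: 1
Severity: Information
Title:Information
Machine: MachineName
App Domain: Test.exe
ProcessId: 5400
Process Name: E:\Test\Test.exe
Thread Name:
Win32 ThreadId:2600
Extended Properties:
"""

# Key: everything up to the first colon on a line.
# Value: the rest of that line, which may itself contain colons.
LINE = re.compile(r"^([^:\n]+):[ \t]*([^\n]*)$", re.MULTILINE)


def parse_event(raw):
    """Return a dict of field name -> value; empty values become "NULL"."""
    fields = {}
    for key, value in LINE.findall(raw):
        fields[key.strip()] = value.strip() or "NULL"
    return fields


fields = parse_event(RAW)
print(fields["Machine"])  # MachineName
print(fields["Message"])  # Matcher: Record not matched for example record
```

Because the key stops at the first colon, values such as the Message text, the timestamp and the E:\ path keep their own colons.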


0 Karma


javiergn
Super Champion

You could try with the extract command at search time:

your search here
| extract pairdelim="\n", kvdelim=":"

Or (better approach probably) try adding the relevant stanzas in props and transforms to extract what you want:

http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextrac...

0 Karma

ew09
New Member

Thanks for the input. The only issue I run into is on the Message line, which usually contains at least two colons. Is there a way to take everything after the Message field name even if it includes a colon? Basically, I want to take everything after the first colon until it hits a newline.

0 Karma

javiergn
Super Champion

See my other answer.
Hope it helps.

0 Karma