Splunk Search

How can I extract breakable_text properly?

ew09
New Member

Hi everyone,

I have many logs in the following format as an example


Timestamp: 6/27/2016 8:40:25 PM
Message: Matcher: Record not matched for example record
Category: Business
Priority: 10
EventId: 1
Severity: Information
Title:Information
Machine: MachineName
App Domain: Test.exe
ProcessId: 5400
Process Name: E:\Test\Test.exe
Thread Name:
Win32 ThreadId:2600
Extended Properties:

I want to be able to grab all the information after the field name, for example I want a field called Machine, and have the data be 'MachineName.' By using a \n as a delimiter I can split all of the lines up, however I do not want the field name itself. Any advice on what regex to create to do this? Thanks!

0 Karma
1 Solution

javiergn
Super Champion

Another approach using a little bit of regex to get exactly what you need.
You can obviously ignore the first lines that I used to replicate your use case:

| stats count | fields - count
| eval _raw = "
Timestamp: 6/27/2016 8:40:25 PM
Message: Matcher: Record not matched for example record
Category: Business
Priority: 10
EventId: 1
Severity: Information
Title:Information
Machine: MachineName
App Domain: Test.exe
ProcessId: 5400
Process Name: E:\Test\Test.exe
Thread Name: 
Win32 ThreadId:2600
Extended Properties:
"
| rex field=_raw max_match=0 "(?msi)^(?<keyvalue>[^:]+:\s?([^\n]+)?)$"
| mvexpand keyvalue
| rex field=keyvalue "(?i)^(?<key>[^:]+):\s?((?<value>[^\n]+)$)?"
| fillnull value value="NULL"
| fields - keyvalue
| eval {key}=value
| fields - key, value
| stats first(*) as * by _raw

Output: see picture

alt text

View solution in original post

0 Karma

javiergn
Super Champion

Another approach using a little bit of regex to get exactly what you need.
You can obviously ignore the first lines that I used to replicate your use case:

| stats count | fields - count
| eval _raw = "
Timestamp: 6/27/2016 8:40:25 PM
Message: Matcher: Record not matched for example record
Category: Business
Priority: 10
EventId: 1
Severity: Information
Title:Information
Machine: MachineName
App Domain: Test.exe
ProcessId: 5400
Process Name: E:\Test\Test.exe
Thread Name: 
Win32 ThreadId:2600
Extended Properties:
"
| rex field=_raw max_match=0 "(?msi)^(?<keyvalue>[^:]+:\s?([^\n]+)?)$"
| mvexpand keyvalue
| rex field=keyvalue "(?i)^(?<key>[^:]+):\s?((?<value>[^\n]+)$)?"
| fillnull value value="NULL"
| fields - keyvalue
| eval {key}=value
| fields - key, value
| stats first(*) as * by _raw

Output: see picture

alt text

0 Karma

javiergn
Super Champion

You could try with the extract command at search time:

your search here
| extract pairdelim="\n", kvdelim=":"

Or (better approach probably) try adding the relevant stanzas in props and transforms to extract what you want:

http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextrac...

0 Karma

ew09
New Member

Thanks for the input. The only issue I run into is on the Message line. There are usually at least two colons, is there a way I can take everything after the Message field even if that character is a colon? Basically, I want to take everything after a colon until it hits a newline

0 Karma

javiergn
Super Champion

see my other answer below.
Hope it helps

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...