I love Splunk's ability to dynamically pull fields at runtime with name=value pairs.
I have several log formats which use a "key name: value" format or similar. The exact rules are:
There may be spaces in both the key name and the value
There are always two spaces between the colon and the value
The value may be blank
Each set of values is separated by three or four spaces
Subject: Security ID: S-1-5-18 Account Name: ACCOUNT$ Account Domain: DOMAIN Logon ID: 0x3e7 Process Information: New Process ID: 0x1bdc Another Field: Test Results negative
The regex I'm trying to use for my extraction is:
\s\s\s([^\s]+):\s\s(.*)\s\s\s
Where am I going wrong?
Use a regex that extracts the key and the value in transforms.conf.
[with_colon]
REGEX = \t([^\s:]+):\s(\S+)\t
FORMAT = $1::$2
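To check the delimiter rules outside Splunk before committing them to transforms.conf, a minimal Python sketch may help. It assumes the rules from the question (pairs separated by a tab or by three or more spaces, keys and values containing single spaces at most, values possibly blank). The names `PAIR_SEPARATOR` and `parse_pairs` and the sample line are hypothetical, made up for illustration, and this is not how Splunk itself performs the extraction.

```python
import re

# Assumption: pairs are separated by tabs or by a run of three or more spaces.
# The two spaces after the colon are too short to be mistaken for a separator.
PAIR_SEPARATOR = re.compile(r"\t+| {3,}")


def parse_pairs(line):
    """Return a list of (key, value) tuples found in one event line."""
    pairs = []
    for chunk in PAIR_SEPARATOR.split(line.strip()):
        if ":" not in chunk:
            continue  # skip anything that is not a "key: value" chunk
        # Split on the first colon only, so the key may contain spaces.
        key, _, value = chunk.partition(":")
        pairs.append((key.strip(), value.strip()))
    return pairs


# Hypothetical sample built explicitly so the whitespace is unambiguous.
sample = (
    "Subject:" + " " * 3
    + "Security ID:  S-1-5-18" + " " * 4
    + "Account Name:  ACCOUNT$" + "\t"
    + "Logon ID:  0x3e7"
)

for key, value in parse_pairs(sample):
    print(repr(key), "=", repr(value))
```

Splitting on the separator first and then on the first colon avoids having one regex decide where a key ends, which is the part that gets hard once keys and values may both contain spaces.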
I suppose the problem, if the key or value may contain spaces, is how to tell where one ends and the next begins. For example, if you see:
one two: three four five: six seven eight nine: ten eleven: twelve
What would be the field names and what would be the values? Apparently "one two" is a field, but is its value "three" or "three four"? And so on.
If you can define that, then a regex can be created, and the idea is essentially what is in Ayn's answer.
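To make the ambiguity concrete, here is a small Python sketch that parses the same line under two different assumed rules. Both regexes and the names `RULE_A` and `RULE_B` are hypothetical illustrations, not something taken from Splunk or from Ayn's answer.

```python
import re

line = "one two: three four five: six seven eight nine: ten eleven: twelve"

# Assumed rule A: the value is exactly one word; the key is everything before the colon.
RULE_A = re.compile(r"([^:]+?): (\S+)\s*")

# Assumed rule B: the key is exactly one word; the value runs up to the next "word:".
RULE_B = re.compile(r"(\S+): (.*?)(?= \S+:|$)")

print(RULE_A.findall(line))
# [('one two', 'three'), ('four five', 'six'), ('seven eight nine', 'ten'), ('eleven', 'twelve')]

print(RULE_B.findall(line))
# [('two', 'three four'), ('five', 'six seven eight'), ('nine', 'ten'), ('eleven', 'twelve')]
```

Both results are self-consistent, so only a rule about the delimiters (tabs, or a fixed number of spaces) can say which one is right.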
There should be a tab character between each set of fields, so prepending and appending a tab to Ayn's regex should do the trick. I'll try it out and see how it goes.
Ayn, many thanks. I've done some additional research on the message format and updated the question accordingly. Could you take a look at what I need in the way of a regex?
True, I missed that in the question! Editing to reflect that, along with the info that tabs are at the start and end of each k/v pair.
Well, the field name may include spaces.