Hi There,
I am trying to parse a log that an application writes to the Windows Event Log. The problem is that the extraction done by the standard WinEvent transforms does not account for there being more than one word before and after the = delimiter.
Example of the log:
20140504153716.000000
Category=0
CategoryString=NULL
EventCode=104
EventIdentifier=-2147483544
EventType=2
Logfile=Application
RecordNumber=106325
SourceName=TestApp
TimeGenerated=20140504123716.000000-000
TimeWritten=20140504123716.000000-000
Type=Warning
User=DOM\admin_app
ComputerName=SPE.DOM.org
wmi_type=WinEventLog:Application
Message=A container violation has been found
Date/time of event = 2014-05-04 15:37:16
Event Severity Level = Warning
File name = \\FS\UNC\Groups\OTS\MAD\File_Test.xlsx
File status = NOT REPAIRED
Component name = TEMP_FILE_01810668
Component disposition = NOT REPAIRED
Container Violation = Encrypted container
Client SID = S-1-5-21-3273526520-21644477317230-11231323
Client Computer = TestComp1
Client IP = 10.12.13.15
Scan Duration (sec) = 0.000
Connect Duration (sec) = 0.031
Server IP address = 10.56.68.45
Uptime (in seconds) = 518661
The extraction doesn't capture everything before and after the delimiter, i.e. I get
"status = NOT" instead of "File status = NOT REPAIRED"
My transforms look like this:
[wel-message]
REGEX = (?sm)^(?<_pre_msg>.+)\nMessage=(?
CLEAN_KEYS = false
[wel-eq-kv]
SOURCE_KEY = _pre_msg
DELIMS = "\n","="
MV_ADD = true
[wel-col-kv]
SOURCE_KEY = Message
REGEX = \n([^:\n\r]+):[ \t]++([^\n]*)
FORMAT = $1::$2
MV_ADD = true
My props look like this:
[TestWinMV]
BREAK_ONLY_BEFORE=^\d+
TIME_FORMAT=%Y%m%d%H%M%S.%3N
BREAK_ONLY_BEFORE_DATE=false
REPORT-MESSAGE = wel-message, wel-eq-kv, wel-col-kv
Thanks in advance,
Naor
It isn't entirely clear to me what you're trying to do with wel-col-kv and wel-message, and both of them are complicating things.
Also, your props.conf isn't going to reliably produce your multiline event with the timestamp you expect.
Try something like this:
[TestWinMV]
BREAK_ONLY_BEFORE=^\d+
TIME_FORMAT=%Y%m%d%H%M%S.%3N
MAX_TIMESTAMP_LOOKAHEAD = 15
SHOULD_LINEMERGE = true
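To make the line-merging idea concrete, here is a rough Python approximation of what SHOULD_LINEMERGE combined with BREAK_ONLY_BEFORE is meant to do: start a new event only at a line that matches the pattern and merge every other line into the current event. This is a sketch, not Splunk's implementation; merge_lines is a hypothetical helper, and the second timestamp in the sample is made up purely for illustration.

```python
import re

# Same pattern as BREAK_ONLY_BEFORE in the stanza above.
BREAK_ONLY_BEFORE = re.compile(r"^\d+")

def merge_lines(lines):
    """Hypothetical helper: group raw lines into multiline events."""
    events, current = [], []
    for line in lines:
        # A line starting with digits opens a new event,
        # so flush whatever has been collected so far.
        if BREAK_ONLY_BEFORE.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events

sample = [
    "20140504153716.000000",
    "Category=0",
    "Message=A container violation has been found",
    "File status = NOT REPAIRED",
    "20140504153720.000000",  # invented second event, for illustration only
    "Category=0",
]
events = merge_lines(sample)
print(len(events))
```

Every line that does not begin with a digit stays attached to the event opened by the preceding timestamp line, which is why the key/value lines after Message= end up in the same event.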
As for the transforms: it looks like you are trying to grab all the lines before the Message key and call that _pre_msg (but you will only get the one field before it), and to call everything to the end of the line "message". But you're not using that key; later you use "Message", and your sample doesn't make clear what exactly you're trying to do with it.
You're grabbing the field before Message, which in this case does have a value with a colon delimiter, but when you try to transform the (sub) key/value you've got a space and a tab in your regex (which don't exist in the key:value). In addition, you're using _pre_msg as the SOURCE_KEY, yet all the fields with spaces come AFTER the Message= anchor, which means you're not going to grab them as it stands.
So for the moment I'll disregard that and stick to your actual question:
Question: How to handle spaces in key and value.
Answer: DELIMS is not enough... write your own regex.
Start with this (get rid of the other transforms to begin with):
[wel-eq-kv]
REGEX = ([^=]+)\s?=\s?([^\n]+)
FORMAT = $1::$2
MV_ADD = true
SOURCE_KEY = _raw
Grab anything that is not an = (that's the key), allow an optional space, then an =, then another optional space, then grab anything that is not a newline (that's the value).
CLEAN_KEYS defaults to true, so Splunk will automagically replace spaces with underscores in the keys.
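If you want to sanity-check the regex outside Splunk, a small Python sketch like the one below applies it to single lines from the sample. Python's re module is not the same engine as Splunk's PCRE, but this pattern only uses features common to both. The clean_key function is a hypothetical stand-in that roughly imitates key cleaning (trimming the key first is my own assumption); it is not Splunk's actual code.

```python
import re

# The regex from the [wel-eq-kv] stanza above, unchanged.
KV = re.compile(r"([^=]+)\s?=\s?([^\n]+)")

def clean_key(key):
    """Rough stand-in for CLEAN_KEYS (an assumption, not Splunk's code)."""
    # Trim, then replace each non-alphanumeric character with "_".
    cleaned = re.sub(r"[^A-Za-z0-9]", "_", key.strip())
    # Drop leading underscores and digits.
    return cleaned.lstrip("_0123456789")

lines = [
    "File status = NOT REPAIRED",
    "Client IP = 10.12.13.15",
    "EventCode=104",
]

fields = {}
for line in lines:
    key, value = KV.match(line).groups()
    fields[clean_key(key)] = value

print(fields)
```

Note that the raw captured key for the spaced lines still carries a trailing space ("File status "), because [^=]+ is greedy and the \s? before the = is optional; the cleaning step is what turns it into a usable field name.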
In the end, however, if this is a home-grown application I'd have a chat with the developer and have them clean up the log file. Spaces in keys... not nice.
Thanks!!!
It worked like magic!
So you could try replacing the wel-col-kv with something like:
REGEX=(?m)^([^=\n]+)\s*=\s*([^\n]+)
FORMAT = $1::$2
MV_ADD = true
If you really wanted to, you could set SOURCE_KEY to Message if you don't want to create fields from the lines above "Message".
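To see what the (?m)^ anchor and the \n inside the key class change, here is a small Python comparison of the two regexes on a trimmed copy of the sample event. It is only a sketch: Python's re is not Splunk's PCRE engine, although both patterns behave the same way for this input as far as I can tell, and the variable names are mine.

```python
import re

raw = (
    "20140504153716.000000\n"
    "Category=0\n"
    "Message=A container violation has been found\n"
    "File status = NOT REPAIRED\n"
    "Client IP = 10.12.13.15"
)

# Earlier regex: the key class [^=]+ is allowed to cross newlines.
loose = re.compile(r"([^=]+)\s?=\s?([^\n]+)")
# Regex from this reply: key anchored to line start, no newlines in it.
anchored = re.compile(r"(?m)^([^=\n]+)\s*=\s*([^\n]+)")

loose_keys = [k for k, _ in loose.findall(raw)]
# The key group is greedy, so trailing spaces are trimmed here.
anchored_pairs = [(k.strip(), v) for k, v in anchored.findall(raw)]

print(repr(loose_keys[0]))
for pair in anchored_pairs:
    print(pair)
```

With the loose pattern the first key swallows the timestamp line together with "Category", and the later keys start with a newline. The anchored pattern skips the timestamp line (it contains no =) and yields one clean pair per line.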