Hi All,
I'm new to using regex, and I've recently made some changes that were pushed to our Splunk production environment, which I'm (unfortunately) unable to see.
I'm hoping someone will be able to give me feedback on a stanza I've created, to verify that I'm using it correctly.
Here is the portion of the log I'd like to extract:
08/19/2013 10:51:31 PM - Process(x) User(y) Program(z)
Host(linux.host.com)
ABC1234: String.
EXPLANATION
…Basically everything before “EXPLANATION”; the next log entry would then continue in the same format.
Here is the stanza I've created:
[props_name]
LINE_BREAKER = ([\r\n]+))
BREAK_ONLY_BEFORE = "("+EXPLANATION+")"+"\n"
SHOULD_LINEMERGE = false
TIME_PREFIX = ^
TIME_FORMAT = %m/%d/%Y_%H:%M:%S %p
MAX_TIMESTAMP_LOOKAHEAD = 23
EXTRACT-props_name = ^.+\s+(?<Process>[^\s]+)\s+(?<User>[^\s]+)\s+(?<Program>[^\s]+)\s+(?<Host>[^\s]+)\s+(?<ABC[^0-9][0-9]{4}[^0-9]>[^\s%]+)
If it’s not too much trouble or too time-consuming, I’d greatly appreciate feedback on this.
Thank You!
It's a good idea to test your regexes with a tool such as regexpal - http://www.regexpal.com/ - or RegExr - http://gskinner.com/RegExr/ .
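If you'd rather check a pattern offline, a short script does the same job. The sketch below is only an illustration, not your original regex: it assumes the labels Process, User, Program and Host always appear literally, captures whatever sits inside the parentheses, and uses a made-up field name, Code, for the ABC1234 token. Note that Python's re module writes named groups as (?P<name>...), whereas Splunk's EXTRACT takes the PCRE form (?<name>...), and that a group name has to be a plain identifier, not a pattern.

```python
import re

# Sample event copied from the question (the lines up to EXPLANATION).
SAMPLE = (
    "08/19/2013 10:51:31 PM - Process(x) User(y) Program(z)\n"
    "Host(linux.host.com)\n"
    "ABC1234: String.\n"
    "EXPLANATION\n"
)

# Hypothetical pattern: anchor on the literal labels and capture the text
# inside each pair of parentheses. "Code" is an invented field name.
PATTERN = re.compile(
    r"Process\((?P<Process>[^)]*)\)\s+"
    r"User\((?P<User>[^)]*)\)\s+"
    r"Program\((?P<Program>[^)]*)\)\s+"
    r"Host\((?P<Host>[^)]*)\)\s+"
    r"(?P<Code>[A-Z]{3}\d{4}):"
)


def extract_fields(event: str) -> dict:
    """Return the named groups as a dict, or an empty dict if nothing matches."""
    match = PATTERN.search(event)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

Because \s+ also matches line breaks, the pattern carries across the three physical lines of the sample without any multiline flag.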
In your case, the sample log you show appears to start with a timestamp, which makes your extraction fail - but maybe the timestamp isn't really part of the log, just something that you included from Splunk's own output?
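If the timestamp does turn out to be part of the log, the TIME_FORMAT may be worth a second look as well. The sketch below uses Python's datetime.strptime purely as an analogy, since Splunk has its own strptime implementation: it suggests that the underscore in %Y_%H cannot match the space in the sample, and that %I (12-hour clock) is the usual partner for %p. The candidate format string is my assumption and has not been checked against a Splunk instance.

```python
from datetime import datetime

RAW = "08/19/2013 10:51:31 PM"             # timestamp from the sample event
ORIGINAL_FORMAT = "%m/%d/%Y_%H:%M:%S %p"   # as written in the stanza
CANDIDATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # assumed fix: space, 12-hour clock


def try_parse(raw: str, fmt: str):
    """Return the parsed datetime, or None if the format does not fit."""
    try:
        return datetime.strptime(raw, fmt)
    except ValueError:
        return None


if __name__ == "__main__":
    print("original :", try_parse(RAW, ORIGINAL_FORMAT))
    print("candidate:", try_parse(RAW, CANDIDATE_FORMAT))
    print("length   :", len(RAW))
```

In Python the original format is rejected and the candidate parses to 22:51:31. The sample timestamp is 22 characters long, so a MAX_TIMESTAMP_LOOKAHEAD of 23 covers it.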
That timestamp is actually part of the log. I've come across many instances where MAX_TIMESTAMP_LOOKAHEAD and the other time-related attributes have worked well. Since these logs are more human-friendly than computer-friendly (unlike, for example, a name=value format where Splunk can auto-extract the values), using regex to extract the fields would be a great help, but I'm not sure I've used it properly. I'll take a look at the links you gave me... Thanks!