I am trying to parse Weblogic records with a sourcetype of weblogic_stdout, but some of the logged events have multiple timestamps that aren't getting parsed separately. For example, the following lines get combined into one event.
...
[2017-07-17 14:16:04,212] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
[2017-07-17 14:16:04,212] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
[2017-07-17 14:16:04,213] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
[2017-07-17 14:16:04,216] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
[2017-07-17 14:16:04,217] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
[2017-07-17 14:16:04,218] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
[2017-07-17 14:16:04,220] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
[2017-07-17 14:16:04,220] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ...
I'm assuming this is a problem with the TIME_FORMAT string in props.conf, but I'm not sure how to handle the different strings that start with [YYYY-MM-DD versus the ones that start with ...
I don't currently have a TIME_FORMAT specified, and I don't know how to set it to recognize both formats.
Is there a better way to split these apart? My users say this messes up their dashboards...
I only see a single time format in your example logs, so just use TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N and TIME_PREFIX = \[
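If you want to sanity-check that prefix/format pair against a sample line before touching props.conf, here is a rough Python sketch. It is only an approximation of what Splunk does: the function name `extract_timestamp` and the constants are made up for this example, and Python's `%f` stands in for Splunk's `%3N`.

```python
import re
from datetime import datetime

# Rough Python analogues of the suggested settings. Assumption: Splunk's %3N
# (milliseconds) is approximated by Python's %f, which accepts 1 to 6 digits.
TIME_PREFIX = re.compile(r"\[")
PY_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_LENGTH = len("2017-07-17 14:16:04,212")


def extract_timestamp(event):
    """Return the datetime found right after the first '[' or None."""
    m = TIME_PREFIX.search(event)
    if m is None:
        return None
    # Only look at the fixed-width slice that follows the prefix.
    candidate = event[m.end():m.end() + TIMESTAMP_LENGTH]
    try:
        return datetime.strptime(candidate, PY_TIME_FORMAT)
    except ValueError:
        return None


line = "[2017-07-17 14:16:04,212] DEBUG: [[ACTIVE] ExecuteThread: '2' for queue: ..."
print(extract_timestamp(line))
```

A line that does not start with this bracketed format comes back as None, which is a quick way to spot events that need a different format.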
For dealing with multiple time formats, see this article. It is probably all you need:
https://www.splunk.com/blog/2014/04/23/its-that-time-again.html
Separating the events may still be a problem, but without seeing the other formats of the events, it's hard to give you any direction.
If I am reading this correctly, the problem is not multiple timestamp formats in one source; it is improper line breaking, where rows are being merged together.
It seems to be a combination of the line breaks and the datetime XML config. I've discovered the log has three different timestamp formats. Examples are:
<Aug 10, 2017 1:50:23 PM EDT>
[2017-08-10 13:50:23,105]
13:50:23.251
My props.conf for this is:
[weblogic_stdout]
DATETIME_CONFIG = /etc/system/weblogic_stdout.xml
LINE_BREAKER = ([\r\n]+)(\[\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\,\d{3}\]|\<\w{3}\s\d{1,2}\,\s\d{4}\s\d{1,2}:\d{2}:\d{2}\s[AP]M\s\w{3,}\>|\d{2}:\d{2}:\d{2}\.\d{3}\s)
SHOULD_LINEMERGE = false
And the following for the datetime defines:
This is working correctly for the first two timestamp formats, breaking the events at the desired timestamps, and picking up the correct date/time for each event in the search results. The third timestamp format, however, inherits the timestamp from the preceding event in the search results. Can anyone shed light on what's wrong with my "_weblogic_stdout_timestamp3" extract specification or the regex string? I'm assuming it's there, but it may be how the information is getting passed along via LINE_BREAKER.
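For anyone who wants to check the LINE_BREAKER from the props.conf stanza above outside of Splunk, here is a minimal Python sketch. It assumes the documented behavior that the text matched by the first capture group is discarded and the next event starts right after it; the helper `break_events` and the sample log text are invented for illustration, not taken from the real log.

```python
import re

# The LINE_BREAKER regex from the stanza, unchanged. Python's re module
# accepts the same escapes used here.
LINE_BREAKER = re.compile(
    r"([\r\n]+)(\[\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\,\d{3}\]"
    r"|\<\w{3}\s\d{1,2}\,\s\d{4}\s\d{1,2}:\d{2}:\d{2}\s[AP]M\s\w{3,}\>"
    r"|\d{2}:\d{2}:\d{2}\.\d{3}\s)"
)


def break_events(raw):
    """Split raw text the way the first capture group is assumed to work:
    the previous event ends where group 1 starts, the next begins where it ends."""
    events = []
    start = 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    return events


# Made-up sample covering the three timestamp formats plus a continuation line.
sample = (
    "<Aug 10, 2017 1:50:23 PM EDT> <Info> made-up message\n"
    "  continuation line without a timestamp\n"
    "[2017-08-10 13:50:23,105] DEBUG: made-up message\n"
    "13:50:23.251 made-up message\n"
)

for event in break_events(sample):
    print(repr(event))
```

With this sample the continuation line stays attached to the first event, and each of the other two formats starts its own event. That only covers event breaking; timestamp extraction is handled separately by the datetime XML.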
Sorry about my datetime defines. I used a Code Sample, and it seems to have eaten the surrounding XML. Fortunately I noticed the trailing > on the third match string, and it now properly extracts the timestamp for those events as well.
My thanks to the as-always helpful Splunk community!
And here is a link that woodcock has provided recently about this:
https://www.function1.com/2013/01/oh-no-splunking-log-files-with-multiple-formats-no-problem
It seems my primary timestamp didn't post properly. It's enclosed in angle brackets, and formatted as month name, day of month, comma, year, and so on.