I'm attempting to index the gc.log from a Tomcat installation, and I can't get it to line-merge properly. Here is a snippet from the log:
517045.370: [GC [PSYoungGen: 516944K->3097K(519296K)] 828767K->315076K(1830016K), 0.0065230 secs]
[Times: user=0.01 sys=0.00, real=0.00 secs]
517076.460: [GC [PSYoungGen: 517337K->2116K(519232K)] 829316K->315460K(1829952K), 0.0184770 secs]
[Times: user=0.01 sys=0.00, real=0.01 secs]
and my props.conf for this sourcetype:
DATETIME_CONFIG = CURRENT
LINE_BREAKER = ([\r\n]+)(?=\d+\.\d+:)
SHOULD_LINEMERGE = FALSE
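As a sanity check outside Splunk, the `LINE_BREAKER` regex can be exercised against the sample lines above with a quick Python sketch (my own illustration, not part of the Splunk config). Splunk breaks events at the text captured by group 1, so the lookahead should keep each `[Times: ...]` line attached to the preceding GC line:

```python
import re

log = (
    "517045.370: [GC [PSYoungGen: 516944K->3097K(519296K)] "
    "828767K->315076K(1830016K), 0.0065230 secs]\n"
    "[Times: user=0.01 sys=0.00, real=0.00 secs]\n"
    "517076.460: [GC [PSYoungGen: 517337K->2116K(519232K)] "
    "829316K->315460K(1829952K), 0.0184770 secs]\n"
    "[Times: user=0.01 sys=0.00, real=0.01 secs]"
)

# Mimic LINE_BREAKER = ([\r\n]+)(?=\d+\.\d+:)
# i.e. split only at newlines that are followed by a new "<seconds>:" stamp.
events = re.split(r"[\r\n]+(?=\d+\.\d+:)", log)

for event in events:
    print(event)
    print("---")
```

This yields two events of two lines each, so the regex itself is sound; the problem lies elsewhere.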
I've also tried:
DATETIME_CONFIG = CURRENT
SHOULD_LINEMERGE = TRUE
BREAK_ONLY_BEFORE = \d+\.\d+:
MUST_NOT_BREAK_BEFORE = \[
Neither configuration helped. I verified that the props above were being applied and not overridden by any other configuration.
The output is displaying as:
2/10/12 11:41:06.000 AM [Times: user=0.01 sys=0.00, real=0.01 secs]
2/10/12 11:40:50.000 AM 517076.460: [GC [PSYoungGen: 517337K->2116K(519232K)] 829316K->315460K(1829952K), 0.0184770 secs]
So the second line of each entry is being recorded as a separate event despite my props.conf settings.
My hunch is that because the two lines are not written to the log at the same time (there appears to be some delay between them, according to Splunk), and because I've set DATETIME_CONFIG = CURRENT, Splunk isn't waiting for the second line to be written before committing the first as an event. This theory is strengthened by the fact that the occasional entry is merged properly.
Is there an appropriate attribute that can control the time Splunk will wait for more lines before considering the event 'done'?
Thanks!
Not a direct answer to your question, but as an alternative to parsing GC log dumps, have you tried Splunk4JMX? I use this app to gather detailed garbage collection and memory metrics for JVM heap and non-heap memory pools.
Your hunch is correct.
I have seen this happen with this type of log (Java garbage collection). It is caused by the application writing the log pausing mid-event for longer than the time after which Splunk's tailing processor decides that EOF has been reached (3 seconds by default). This typically happens when the application buffers its writes.
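The buffering effect is easy to reproduce outside the JVM. This hypothetical Python sketch (my own illustration, not Splunk-specific) shows that a line written through a buffered file object is invisible to a separate reader, such as a tailing process, until the writer flushes:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "gc.log")

# Writer with a large buffer: short lines sit in the process's buffer.
writer = open(path, "w", buffering=65536)
writer.write("517045.370: [GC ...]\n")

# A tailing reader (like Splunk's tailing processor) sees nothing yet,
# so after its EOF timeout it may finalize a half-written event.
with open(path) as f:
    before_flush = f.read()

writer.flush()  # e.g. the JVM's buffer fills up a few seconds later

with open(path) as f:
    after_flush = f.read()

writer.close()
print(repr(before_flush), repr(after_flush))
```

If the pause between the buffered write and the flush exceeds the tailing processor's EOF timeout, the event is split exactly as described in the question.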
There are two solutions to this problem:
- Increase the time_before_close parameter in inputs.conf. Note that I would not recommend pushing this value too high, and it would be best to scope the change to the affected log only.
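A sketch of what that change might look like, scoped to the GC log only. The monitor path, sourcetype name, and value here are placeholders of my own to illustrate the syntax (the default for time_before_close is 3 seconds, per the explanation above):

```
# inputs.conf -- hypothetical stanza; path, sourcetype, and value are examples
[monitor:///opt/tomcat/logs/gc.log]
sourcetype = tomcat_gc
# Wait longer after reaching EOF before closing the file and finalizing
# the last event (default: 3 seconds). Don't raise this more than needed.
time_before_close = 10
```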