Getting Data In

BREAK_ONLY_BEFORE failed, setting TIME_FORMAT solved the problem

yahooku
Explorer

Hi, I've been trying to split separate events that were wrongly merged into one:

10:42:08  Checkpoint Completed:  duration was 0 seconds.
10:42:08  Checkpoint loguniq 4227, logpos 0x4ca7018, timestamp: 0x7f8d03be
10:42:08  Maximum server connections 1414 

An obvious thing to do is to use the BREAK_ONLY_BEFORE attribute - or is it? Here's what I tried in /local/props.conf:

[host::some_host_name]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = ^/d/d:/d/d:/d/d

Surprisingly, this didn't work. Needless to say, I tried countless variations of BREAK_ONLY_BEFORE and tried other attributes. Finally I tried the TIME_FORMAT attribute:

[host::some_host_name]
SHOULD_LINEMERGE = True
TIME_FORMAT = %H:%M:%S

...and it worked like a charm. Can someone explain why this worked while the former didn't? And what should the proper BREAK_ONLY_BEFORE attribute look like for this to work? I didn't find anything satisfying on the forums.

0 Karma
1 Solution

kristian_kolb
Ultra Champion

Well,

First of all, I would not recommend using [host::your_host] configuration stanzas in your props.conf file, since the rules would then apply to all events coming from this host, regardless of the format of the event/timestamp. It's much more logical to use the [your_sourcetype] style of configuration, since rules are then applied based on the type of data coming in, rather than on where it originated.

Secondly, why use SHOULD_LINEMERGE=true if the events are single-line? This may be one of the reasons for your problems: Splunk tries to find a full timestamp (including date), and has to merge several lines to find some characters it thinks fit.

Thirdly, though this may seem a bit redundant, your regex for BREAK_ONLY_BEFORE has forward slashes rather than backslashes. As written, /d matches a literal slash followed by the letter d, whereas \d matches a digit.
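To illustrate the slash mix-up, here is a minimal sketch in Python. It assumes Python's re module treats these two simple patterns the same way Splunk's regex engine does, which should hold for anything this basic, but it is not Splunk itself. The names `posted` and `intended` are made up for the example.

```python
import re

sample = "10:42:08  Checkpoint Completed:  duration was 0 seconds."

# Pattern as posted: "/d" is a literal slash followed by the letter d.
posted = re.compile(r"^/d/d:/d/d:/d/d")
# Intended pattern: "\d" matches one decimal digit.
intended = re.compile(r"^\d\d:\d\d:\d\d")

print(posted.search(sample))            # None: the line has no literal "/d" text
print(intended.search(sample).group())  # 10:42:08
```

The posted pattern could only ever match a line that literally starts with the characters /d/d:/d/d:/d/d, so it never triggers a break on the sample events.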

My suggestion is that you use the following instead:

[your_sourcetype]
SHOULD_LINEMERGE=false
MAX_TIMESTAMP_LOOKAHEAD=8
TIME_FORMAT=%H:%M:%S

UPDATE:

If you also have multiline messages, you still should not line-merge; break on the timestamp with LINE_BREAKER instead:

[your_sourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d\d:\d\d:\d\d\s
MAX_TIMESTAMP_LOOKAHEAD = 8
TIME_FORMAT = %H:%M:%S
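As a rough illustration of what this LINE_BREAKER does, the sketch below imitates it in Python. It is an approximation under stated assumptions: the newline run captured by the first group is discarded, and the timestamp that follows stays at the start of the next event (modelled here with a lookahead). The function `split_events` and the multi-line sample lines are hypothetical and not taken from the original log.

```python
import re

raw = (
    "10:42:08  Checkpoint Completed:  duration was 0 seconds.\n"
    "10:42:08  Maximum server connections 1414\n"
    "10:42:09  Hypothetical multi-line message, first line\n"
    "    continuation line without a timestamp\n"
    "10:42:10  Another hypothetical event\n"
)

def split_events(text):
    """Split text into events at newlines followed by HH:MM:SS and whitespace."""
    # The newline run is consumed (like the LINE_BREAKER capture group);
    # the lookahead keeps the timestamp as the start of the next event.
    parts = re.split(r"[\r\n]+(?=\d\d:\d\d:\d\d\s)", text)
    return [p.rstrip("\r\n") for p in parts if p.strip()]

events = split_events(raw)
for number, event in enumerate(events, start=1):
    print(number, repr(event))
```

With this input the continuation line stays attached to the 10:42:09 event, while every line that begins with a timestamp starts a new event.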

Hope this helps,

Kristian

yahooku
Explorer

OK, thanks for clearing this up.

0 Karma

kristian_kolb
Ultra Champion

Updated answer above. And while Ayn has a point, it is still a fact that LINE_BREAKER is more efficient than the combination of SHOULD_LINEMERGE and BREAK_ONLY... directives.

/k

Ayn
Legend

As Splunk by default breaks events when it encounters a valid timestamp (as defined by the BREAK_ONLY_BEFORE_DATE configuration parameter), improper line breaking is very often a symptom of improper timestamp parsing. So, configuring timestamp parsing correctly is a much better option than messing with other breaking directives - you get valid timestamps AND valid event breaking.
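A simplified sketch of this idea in Python: lines are merged until one starts with something that parses as a timestamp. It assumes the TIME_FORMAT and 8-character lookahead suggested earlier in the thread; Splunk's real timestamp recognition is considerably more elaborate, and the helper names here are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S"
LOOKAHEAD = 8  # mirrors MAX_TIMESTAMP_LOOKAHEAD = 8

def starts_with_timestamp(line):
    """Return True if the first LOOKAHEAD characters parse with TIME_FORMAT."""
    try:
        datetime.strptime(line[:LOOKAHEAD], TIME_FORMAT)
        return True
    except ValueError:
        return False

def merge_lines(lines):
    """Start a new event at each timestamped line; append other lines to the current event."""
    events = []
    for line in lines:
        if starts_with_timestamp(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

lines = [
    "10:42:08  Checkpoint Completed:  duration was 0 seconds.",
    "10:42:08  Checkpoint loguniq 4227, logpos 0x4ca7018, timestamp: 0x7f8d03be",
    "10:42:08  Maximum server connections 1414",
]
print(len(merge_lines(lines)))  # 3
```

If the timestamp check never succeeds (for example because the expected format includes a date), every line is appended to the first event, which is the kind of false merging described in the question.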

yahooku
Explorer

Thanks for the quick answer. You are right about replacing the host with a source identifier - I have only one source from this host, but it's still not a good thing to do.

About SHOULD_LINEMERGE: not all events are single-line, only those which were merged together.

And about the regex: sorry for the mistake, I just copied the result of some desperate attempt to make this work. I'm sure I also tried the right regex - I checked it in a text editor when BREAK_ONLY_BEFORE didn't seem to work.

0 Karma