[operlog]
LINE_BREAKER = (?m)(.\d{7}.\d\d:\d\d:\d\d.\d\d)
SHOULD_LINEMERGE = false
Why do my events have the text that I specified in my LINE_BREAKER removed?
Are my parens wrong? Should I add a different command?
In a nutshell I want my line break to happen when the weird date format shows up. But I want the date format to be in the event.
Thanks,
Paul
When using LINE_BREAKER you have a regular expression in up to three parts:
LINE_BREAKER = <the previous event end>(<the data between events>)<the new event here>
That is to say, only the part in the capturing group is removed; anything matched after the group stays at the start of the new event. So, to break only on newlines that are followed by your date pattern:
LINE_BREAKER = ([\r\n]+).\d{7}.\d\d:\d\d:\d\d.\d\d
This seems to cause some confusion, but using LINE_BREAKER (with SHOULD_LINEMERGE = false) is my preferred method, as it only requires remembering one thing and covers most cases in a quick and simple way.
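Putting that together with the stanza from the question, a minimal props.conf sketch might look like the one below. It assumes every event in the operlog source starts on a new line with the date pattern from the question; the comment lines are only illustrative.

```ini
[operlog]
# Break on LINE_BREAKER matches alone; do not merge lines afterwards.
SHOULD_LINEMERGE = false
# Only capture group 1 (the run of newlines) is discarded.
# The date pattern sits outside the group, so it is kept as the
# start of the next event.
LINE_BREAKER = ([\r\n]+).\d{7}.\d\d:\d\d:\d\d.\d\d
```

Compared with the original setting, the only real change is moving the parentheses: the date pattern still decides where the break happens, but it is no longer inside the group that gets thrown away.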
Thank you - this was exactly what I was trying to figure out. I had a regex and couldn't figure out why only part of it was disappearing. I didn't understand about the capture group.
The 'd' for digit does not seem correct to me; try '\d'.
When I posted it there was a backslash before the d, so try '\d'.
One way to think about this is: LINE_BREAKER "defines" the line-break characters. The "line breaks" (defined by the regex capture group) are removed. This is the correct behavior for the LINE_BREAKER. See the Splunk docs on line breaking
I think that you probably want BREAK_ONLY_BEFORE
BREAK_ONLY_BEFORE = (.\d{7}.\d\d:\d\d:\d\d.\d\d)
or maybe MUST_BREAK_AFTER
Further, BREAK_ONLY_BEFORE (and MUST_BREAK_AFTER) only require that you supply a string that uniquely appears in the first line (or last line) of the event - the regular expression is unanchored. LINE_BREAKER requires a regular expression that is anchored both at the end of the last event line and the beginning of the first event line.
It may be faster to use LINE_BREAKER, but what good is that if the regular expression is wrong?
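For comparison, here is a rough sketch of the BREAK_ONLY_BEFORE route for the same stanza. One point goes beyond what is stated in this thread: as far as I know from props.conf.spec, BREAK_ONLY_BEFORE and MUST_BREAK_AFTER are only honored when SHOULD_LINEMERGE = true, so the false setting from the question would have to change. Treat that as an assumption to check against the docs for your Splunk version.

```ini
[operlog]
# Line merging must be on for BREAK_ONLY_BEFORE to take effect
# (assumption from props.conf.spec, not stated in this thread).
SHOULD_LINEMERGE = true
# Start a new event only at a line that matches the date pattern.
# Nothing is removed from the event text.
BREAK_ONLY_BEFORE = .\d{7}.\d\d:\d\d:\d\d.\d\d
```

The parentheses from the earlier suggestion are dropped here because BREAK_ONLY_BEFORE does not use a capture group to decide what to discard; it simply tests each line against the pattern.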
You are right that Mike's comment is correct and I was unclear. The capture portion of the regular expression is the only part that is removed when you use LINE_BREAKER.
Perhaps LINE_BREAKER is preferred for people who know regular expressions. In my experience, the number of people who can write a proper LINE_BREAKER regular expression is quite small. The manual actually says that LINE_BREAKER "might increase your indexing speed, but is somewhat more difficult to work with."
LINE_BREAKER is the preferred method. Please see Mike's post below.
Thanks lguinn! I didn't see the relationship between the \r\n and its removal to create an event. Once I saw that, MUST_BREAK_AFTER made perfect sense.
Thanks again.
Paul
Thanks Lowell - I have edited my answer to correct the missing backslashes!
Paul, note that the backslashes before the "d"s were removed. The text formatting is a bit messed up unfortunately. 😞 The suggestion here is right on!