Solved: Separate combined log entries at search time

vbumgarn · 05-20-2010

I have an ongoing problem that I hope just goes away when I upgrade completely to v4. My current setup is v3 forwarders sending data to a v3 indexer, which stores the data and forwards all of it to a v4 indexer.

Every once in a while, logs end up indexed with multiple events crammed together, ignoring the BREAK_ONLY_BEFORE pattern. I of course cannot reproduce the problem outside of production. I'm sure it is something to do with overflowing some buffer somewhere.

My props.conf looks like this:

[rm3]
BREAK_ONLY_BEFORE=20[0-9][0-9]-\d+-\d+\s+\d+:\d+:\d+,\d+\s+
pulldown_type = true
AUTO_TAG = false
KV_MODE = none
MAX_TIMESTAMP_LOOKAHEAD = 25
MAX_EVENTS = 512
AUTO_LINEMERGE = false

Anyway, I'm hoping that, in the short term, there is some command that can split up results based on a pattern at search time. In this case, I want to break on ^2010. If no such command exists, I'll write one; I was just hoping something already exists.

Thanks, Vincent

Stephen_Sorkin · 08-21-2010

There's no good way to do this at search time since field extraction is run before you'd have a chance to do anything meaningful to the events. A technique like this can be used to split separate lines into separate results, but it's filled with problems:

... | rex mode=sed "s/\n/NL/g" | eval raw=_raw | makemv raw delim="NL" | mvexpand raw | eval _raw = raw

Some of the big problems are:

If separate lines have separate timestamps, all expanded results will still have the same timestamp as the first line.
Every line will inherit the fields extracted from the parent event, most likely those from the first line.
The results aren't treated as events but rather as results, so they won't show up properly in flashtimeline.


vbumgarn · 08-23-2010

That would work. I wrote a "split" command to do effectively the same thing, and it has exactly the same problems.
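For anyone wanting to try the same idea, here is a minimal sketch of the splitting logic only, not the actual "split" command mentioned above and not wired into Splunk's custom command interface. The function name split_merged_event and the EVENT_START pattern are assumptions for illustration; the pattern is adapted from the BREAK_ONLY_BEFORE setting in the question. Unlike the makemv/mvexpand search, it breaks only at timestamp lines, so continuation lines stay attached to their event.

```python
import re

# Assumed event-start pattern, adapted from the BREAK_ONLY_BEFORE in the question.
# The lookahead makes the match zero-width, so the timestamp stays in the event.
EVENT_START = re.compile(
    r"^(?=20\d\d-\d+-\d+\s+\d+:\d+:\d+,\d+\s+)", re.MULTILINE
)


def split_merged_event(raw):
    """Split one merged event into pieces that each begin at a timestamp line.

    Hypothetical helper: it only handles the text of a single event.
    """
    starts = [m.start() for m in EVENT_START.finditer(raw)]
    if not starts or starts[0] != 0:
        # Keep any leading text that does not begin with a timestamp.
        starts.insert(0, 0)
    bounds = starts + [len(raw)]
    pieces = [raw[a:b].rstrip("\r\n") for a, b in zip(bounds, bounds[1:])]
    # Drop empty pieces (for example, when raw is an empty string).
    return [p for p in pieces if p]


if __name__ == "__main__":
    merged = (
        "2010-05-20 10:00:00,123 INFO first event\n"
        "  continuation line\n"
        "2010-05-20 10:00:01,456 WARN second event\n"
    )
    for piece in split_merged_event(merged):
        print(repr(piece))
```

As noted above, results produced this way inside a search would still carry the parent event's timestamp and extracted fields unless those are recomputed for each piece.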

Deep suggested LINE_BREAKER instead of BREAK_ONLY_BEFORE. We're going to try that and see if the problem goes away.
