What is the difference between LINE_BREAKER and BREAK_ONLY_BEFORE?

Madhan45
Path Finder

What is the main difference between LINE_BREAKER and BREAK_ONLY_BEFORE and what is the use of these?

acharlieh
Influencer

LINE_BREAKER and BREAK_ONLY_BEFORE are both props.conf settings, and they're used in different parts of the parsing / indexing process.

You can see a detailed chart of this on the Splunk Wiki. LINE_BREAKER defines what ends a "line" in an input file. By default, it matches any run of CR and LF characters. Depending on the format of your input, you may need to change it for correctness. Alternatively, if your log format can be separated into events by a simple regex, you can set LINE_BREAKER to match the event boundary and set SHOULD_LINEMERGE to false to skip the next step of the process.
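As a rough illustration of the single-step approach, here is a minimal props.conf sketch. The sourcetype name my_app_log and the assumption that every event starts with a YYYY-MM-DD date are hypothetical, so adjust the regex to your own format. The text matched by the first capture group is discarded and marks the event boundary, so the date that follows it starts the next event.

```ini
# props.conf
# Hypothetical sourcetype; assumes each event begins with a date like 2024-05-01
[my_app_log]
# The captured newlines are dropped; the date after them begins the next event
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s
# Skip the line-merging phase entirely
SHOULD_LINEMERGE = false
```

Comments go on their own lines here because props.conf treats trailing text on a setting line as part of the value.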

During the next phase, which runs only when SHOULD_LINEMERGE is true, Splunk takes the individual lines and combines them back together to form events. (Certain log formats may have multi-line events, especially stack traces.) BREAK_ONLY_BEFORE is one of several line-merging attributes used to determine where the event boundaries are.
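For comparison, a sketch of the two-step approach might look like the following. The sourcetype name my_java_log and the date-prefixed event format are again assumptions for illustration only. LINE_BREAKER is left at its default, and lines are merged back together until one matches the BREAK_ONLY_BEFORE regex.

```ini
# props.conf
# Hypothetical sourcetype; assumes a new event always starts with a date,
# while stack trace continuation lines do not
[my_java_log]
# Merge individual lines back into multi-line events
SHOULD_LINEMERGE = true
# Start a new event only at a line matching this regex
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}\s
```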

Madhan45
Path Finder

Thank you acharlieh

jonathon
Path Finder

Is there an indexing performance gain from using one over the other? For instance, with JSON-formatted events?

acharlieh
Influencer

In terms of parsing events, you may see some gains if you can split events with a simple LINE_BREAKER regex and SHOULD_LINEMERGE=false, as you essentially skip a step. That said, if your logs can't be split with a simple enough regex, you could wind up spending more time than you would using the other aggregation settings (but that'd be something to measure as you try things out).

For JSON-formatted and other structured events (like CSV and IIS/W3C logs), you actually have another option to play with as well: you could potentially offload the parsing to your Universal Forwarders and eliminate search-time parsing by using INDEXED_EXTRACTIONS and the _json sourcetype.
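If you'd rather define your own sourcetype than use _json directly, a sketch along these lines may work. The sourcetype name my_json_events is hypothetical, and where each stanza lives depends on your deployment, so treat this as a starting point to test rather than a definitive configuration.

```ini
# props.conf on the Universal Forwarder
# Hypothetical sourcetype for files containing JSON events
[my_json_events]
# Extract fields from the structured data before it is indexed
INDEXED_EXTRACTIONS = JSON

# props.conf on the search head
[my_json_events]
# Turn off search-time key/value extraction so fields are not extracted twice
KV_MODE = none
```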
