Getting Data In

What is the difference between LINE_BREAKER and BREAK_ONLY_BEFORE?

Madhan45
Path Finder

What is the main difference between LINE_BREAKER and BREAK_ONLY_BEFORE, and what is each one used for?

acharlieh
Influencer

LINE_BREAKER and BREAK_ONLY_BEFORE are both props.conf settings, and they're used in different parts of the parsing / indexing process.

You can see a detailed chart of this on the Splunk Wiki. In short, LINE_BREAKER defines what ends a "line" in an input file. By default it's any run of one or more CR and LF characters. Depending on the format of your input, you may need to change it for correctness. Alternatively, if your log format can be separated into events by a simple regex, you can set LINE_BREAKER to match the event boundary and set SHOULD_LINEMERGE to false to skip the next step of the process.
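
As a rough sketch of that single-step approach, assuming a hypothetical sourcetype named my_app_log whose events each start with a date like 2024-01-15, props.conf might look something like this. The stanza name and the date pattern are assumptions for illustration, so adjust the regex to your own data.

```ini
# props.conf (sketch; stanza name and date pattern are hypothetical)
[my_app_log]
# Break wherever a run of CR/LF characters is followed by a date.
# The text matched by the first capture group is discarded as the
# boundary; the date itself stays at the start of the next event.
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s
# Each "line" is already a whole event, so skip the line-merging phase.
SHOULD_LINEMERGE = false
```

With a pattern like this, a multi-line stack trace stays attached to the event above it, because newlines that aren't followed by a date don't count as a break.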

During the next phase, Splunk takes the individual lines and combines them back together to form events. (Some log formats have multi-line events, stack traces especially.) BREAK_ONLY_BEFORE is one of several attributes used to determine where the event boundaries are, and it only applies when SHOULD_LINEMERGE is true.
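
A sketch of this merge-phase setup, assuming a hypothetical sourcetype my_app_log whose events each begin with a date like 2024-01-15 (both the stanza name and the date pattern are made up for illustration):

```ini
# props.conf (sketch; stanza name and date pattern are hypothetical)
[my_app_log]
# Leave LINE_BREAKER at its default so the input is split on newlines,
# then let the merge phase reassemble multi-line events.
SHOULD_LINEMERGE = true
# Start a new event only when a line begins with a date.
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}\s
```

Here the regex is tested against each individual line, so continuation lines such as stack trace frames get merged into the preceding event.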

Madhan45
Path Finder

Thank you acharlieh

jonathon
Path Finder

Is there an indexing performance gain by using one over the other? For instance, JSON formatted events?

acharlieh
Influencer

In terms of parsing events, you may see some gains if you can split events with a simple LINE_BREAKER regex and SHOULD_LINEMERGE=false, since you essentially skip a step. That said, if your logs can't be split with a simple enough regex, you could wind up spending more time than you would with the other aggregation settings (that's something to measure as you try things out).

For JSON and other structured events (like CSV and IIS/W3C logs), you have another option to play with as well: you can potentially offload the parsing to your Universal Forwarders and eliminate search-time parsing by using INDEXED_EXTRACTIONS and the _json sourcetype.
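
A minimal sketch of that structured-data setup, assuming you use a hypothetical custom sourcetype my_json_events rather than the built-in _json. Because the Universal Forwarder does the parsing in this case, the first stanza would live in props.conf on the forwarder. The second stanza, on the search head, is a commonly used companion setting so the same fields aren't extracted again at search time. Treat the exact placement as something to verify against your own deployment.

```ini
# props.conf on the Universal Forwarder (stanza name is hypothetical)
[my_json_events]
# Parse each event as JSON and write the fields at index time.
INDEXED_EXTRACTIONS = json

# props.conf on the search head (same hypothetical stanza name)
[my_json_events]
# Fields are already indexed, so turn off search-time key/value extraction.
KV_MODE = none
```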
