Why is Splunk log line breaking not working as expected for my multiline events?

tkwaller
Builder

Hello

I have some multiline events along with normal single-line events in a log that is being monitored by Splunk. For some reason, I can't get the multiline event to merge into one event; it always breaks before "Date". Here's the log record that's not breaking correctly:

2015-12-03 14:16:51,099 [9892#0001/xxxxx] thread=1 priority=DEBUG app_name=xxxx log_source=Ixxxx - Response Headers=cxxxx
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Content-Encoding: 
com-xxsx-dye: _xxxx
title=wwww
Transfer-Encoding: chunked
Connection: Keep-Alive
Cache-Control: max-age=300
Content-Type: application/json
Date: Thu, 03 Dec 2015 14:16:50 GMT
Server: xxxx
X-Powered-By: xxxx
, Content={"bunch of stuff here}

This is what it looks like in Splunk:

12/3/15
8:16:51.099 AM  
2015-12-03 14:16:51,099 [9892#0001/xxx] thread=1 priority=DEBUG app_name=xxx log_source=xxx - Response Headers=cxxx
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Content-Encoding: 
com-xxx-dye: _xxx
title=wwww
Transfer-Encoding: chunked
Connection: Keep-Alive
Cache-Control: max-age=300
Content-Type: application/json

12/3/15
8:16:50.000 AM  
Date: Thu, 03 Dec 2015 14:16:50 GMT
Server: xxx
X-Powered-By: xxx
, Content={bunch of stuff here}

Here is my props.conf from the search head cluster:

[my_sourcetype]
MAX_TIMESTAMP_LOOKAHEAD = 20
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = ^

I'm fairly certain it's breaking on "Date" because of the second timestamp in the record, but since I specified the time format, shouldn't it NOT break there?
How can I prevent it from breaking there?

Thanks for the help!

1 Solution

jplumsdaine22
Influencer

We usually set SHOULD_LINEMERGE to false and use a LINE_BREAKER that breaks before the timestamp, e.g.:

[sourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\n\r]+)(?=\d{4}-\d{2}-\d{2})
DATETIME_CONFIG = /etc/datetime.xml

It's always worth testing your breaks in a regex tool (for example https://www.debuggex.com/; there are loads out there), and you can always use the Add Data tool in the Splunk Web UI.
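If you'd rather test from a script than a web tool, here is a minimal Python sketch of how that LINE_BREAKER should cut a trimmed copy of the sample event. Some assumptions to be aware of: Python's re module is only a stand-in for Splunk's regex engine, the second event in SAMPLE is made up for illustration, and split_events is a hypothetical helper that approximates the breaking behaviour (cut at each match, discard the first capture group). It is not how Splunk is implemented.

```python
import re

# Same pattern as the LINE_BREAKER above. Python's re is only a stand-in for
# Splunk's regex engine, so treat the result as an approximation.
LINE_BREAKER = re.compile(r"([\n\r]+)(?=\d{4}-\d{2}-\d{2})")

# Trimmed copy of the event from the question, plus a made-up second event.
SAMPLE = (
    "2015-12-03 14:16:51,099 [9892#0001/xxxxx] thread=1 priority=DEBUG - Response Headers=cxxxx\n"
    "Vary: Accept-Encoding\n"
    "Date: Thu, 03 Dec 2015 14:16:50 GMT\n"
    "Server: xxxx\n"
    ", Content={...}\n"
    "2015-12-03 14:16:52,100 [9892#0001/xxxxx] thread=1 priority=DEBUG single line event\n"
)


def split_events(raw):
    """Hypothetical helper: cut at each match and drop the first capture group."""
    events, start = [], 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the newline run
        start = match.end(1)                      # next event starts after it
    events.append(raw[start:])
    return [event.strip() for event in events if event.strip()]


events = split_events(SAMPLE)
for number, event in enumerate(events, 1):
    print(f"--- event {number} ---")
    print(event)
```

The "Date: Thu, 03 Dec 2015 ..." line stays inside the first event because the newline before it is not followed by a yyyy-mm-dd pattern, so the lookahead fails there.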

tkwaller
Builder

So after adding the
TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = ^

and adding the props.conf app to the indexers, the line breaking still has not changed. 😞
This is the hardest time I've had with line breaking; usually it's really easy.

jplumsdaine22
Influencer

Run $SPLUNK_HOME/bin/splunk cmd btool props list --debug on your indexer to make sure that the changes have been applied correctly. Also, have you confirmed that the regex breaks your events properly using a regex checker, as I suggested? The text in your files may not be identical to what you posted here because of formatting changes.

tkwaller
Builder

I see what I did.
See this:

LINE_BREAKER = ([nr]+)((d{4}-d{2}-d{2}) )

More specifically this:

([nr]+)

Should be:

"([\r\n]+)"

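The backslashes were missing, so the character class matched the literal letters "n" and "r" instead of newline characters. Once I changed it and updated, it is breaking correctly.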
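For anyone who hits the same thing, the difference is easy to reproduce outside Splunk. This is a small Python sketch, assuming Python's re behaves like Splunk's regex engine for these simple patterns; the sample text is made up, and the \d escapes are written in full in both patterns so that only the character class differs.

```python
import re

# Character class without backslashes: matches the literal letters "n" and "r".
broken = re.compile(r"([nr]+)((\d{4}-\d{2}-\d{2}) )")
# Character class with backslashes: matches carriage returns and newlines.
fixed = re.compile(r"([\r\n]+)((\d{4}-\d{2}-\d{2}) )")

# Made-up text: the tail of one event followed by the start of the next.
text = "Server: xxxx\n2015-12-03 14:16:52,100 next event"

broken_match = broken.search(text)
fixed_match = fixed.search(text)

print("broken pattern finds a break:", broken_match is not None)
print("fixed pattern finds a break:", fixed_match is not None)
```

With the broken pattern there is nowhere to break unless a letter "n" or "r" happens to sit directly in front of a date, which is consistent with the events never splitting as intended.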

tkwaller
Builder

I tried this as well. I don't have "/etc/datetime.xml", so I omitted that portion.
The result is still the same as above. I updated my props to:
[timestamp]

MAX_TIMESTAMP_LOOKAHEAD = 20
SHOULD_LINEMERGE = false
LINE_BREAKER = ([nr]+)((\d{4}-\d{2}-\d{2})  )

This should ONLY pick up the initial datetime stamp, as it is the ONLY one preceded by ([nr]+) and the ONLY one in the \d{4}-\d{2}-\d{2} format with a space after the date and before the time.

vasildavid
Path Finder

In a previous comment you noted that you pushed this to the SH cluster. Linebreaking and timestamping are index-time operations, so these props need to be included on the indexers as well.

tkwaller
Builder

Ah, I see, that's probably the problem then. I will move the app and deploy it to the indexers.

jplumsdaine22
Influencer

It should indeed!

You could use TIME_FORMAT and TIME_PREFIX instead of DATETIME_CONFIG. And as the timestamp will always be at the start of the line, you can drop MAX_TIMESTAMP_LOOKAHEAD.

tkwaller
Builder

Ha! I originally had that there:
TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = ^

I put it back in; we'll see if that makes a difference. So far there has been no change in the breaking.

emiller42
Motivator

Just to reinforce this: use SHOULD_LINEMERGE=false and LINE_BREAKER= whenever possible.

sundareshr
Legend

See if this fixes it

[sourcetype]
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE=\d{4}-\d{2}-\d{2}

tkwaller
Builder

I made the props.conf as suggested:

[tt_integrationengine]
MAX_TIMESTAMP_LOOKAHEAD = 20
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE=\d{4}-\d{2}-\d{2}

and re-pushed the change to the SH cluster, but the search results are still the same as above.

sundareshr
Legend

Try moving your props.conf to the indexer.
