Getting Data In

Mixed text/JSON log events are splitting on a date in the JSON data

tlabue
Path Finder

I have logs coming in that are either plain text (single line) or text followed by a JSON string.

I have no issues with the straight text, but if there is additional JSON, the event breaks on an attribute with a date.

If the JSON has no additional date, it appears to be OK.

Sample log event with JSON
2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 [Thread-3-ESWriterBolt] DEBUG BaseBolt - {
"attribute1": 243,
"attribute2": "Standard",
"attribute3": "2018-11-28T13:11:45.3720",
"attribute4": "Y"
}

Everything up to attribute2 is read correctly. However, attribute3 starts a new event, timestamped with the date value there, which runs to the end of the JSON or until another date field appears.

The current props.conf for this log type just parses a few fields and also includes TRUNCATE = 0 for no truncation of these events.

What else do I need to set up in props.conf to make this work?

Thanks!

harsmarvania57
Ultra Champion

You can try the configuration below on the Indexer or Heavy Forwarder, whichever receives the data first from the Universal Forwarder.

props.conf

[yoursourcetype]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD=28
LINE_BREAKER=([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\W\d{4}
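To sanity-check the LINE_BREAKER regex outside Splunk, here is a rough Python sketch. It assumes the documented LINE_BREAKER behaviour (the text matched by the first capturing group is discarded and the next event starts right after it). The helper name break_events and the second, single-line event in the sample are made up for illustration; the first event is based on the sample from the question.

```python
import re

# Same pattern as the LINE_BREAKER value above.
LINE_BREAKER = re.compile(
    r"([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\W\d{4}"
)

def break_events(raw):
    """Split raw text into events roughly the way LINE_BREAKER would.

    The first capturing group (the newlines) is dropped; everything
    before it ends one event, everything after it starts the next.
    """
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return events

# First event: sample from the question. Second event: hypothetical.
sample = (
    '2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 '
    '[Thread-3-ESWriterBolt] DEBUG BaseBolt - {\n'
    '"attribute1": 243,\n'
    '"attribute2": "Standard",\n'
    '"attribute3": "2018-11-28T13:11:45.3720",\n'
    '"attribute4": "Y"\n'
    '}\n'
    '2018-11-28T11:25:33.102+0000 STDIO [INFO] plain single-line message'
)

events = break_events(sample)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

The date inside attribute3 does not trigger a break here because it is not directly preceded by a newline and it has no timezone offset after the milliseconds, so the whole JSON body stays in the first event.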

FrankVl
Ultra Champion

SHOULD_LINEMERGE must be set to false when you use LINE_BREAKER.

Other than that, this should do the trick. The reason for this behavior: by default, Splunk automatically detects timestamps and assumes each one marks the start of a new event. That works fine for single-line events, or for events with a single timestamp on their first line, but not for events like these, which contain additional dates further down.
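As a very rough illustration of that default, here is a Python sketch that merges lines and starts a new event whenever a line contains something date-like. This is only an approximation for explaining the symptom: Splunk's real timestamp detection is far more involved, and the DATE_LIKE pattern and function name are my own assumptions.

```python
import re

# Assumed stand-in for Splunk's timestamp auto-detection (much simplified).
DATE_LIKE = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")

def merge_lines_breaking_before_dates(raw):
    """Merge lines into events, breaking before any line with a date."""
    events = []
    for line in raw.splitlines():
        if DATE_LIKE.search(line) or not events:
            events.append(line)          # a date-like line starts a new event
        else:
            events[-1] += "\n" + line    # otherwise glue onto the current event
    return events

# Based on the sample event from the question.
sample = (
    '2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 '
    '[Thread-3-ESWriterBolt] DEBUG BaseBolt - {\n'
    '"attribute1": 243,\n'
    '"attribute2": "Standard",\n'
    '"attribute3": "2018-11-28T13:11:45.3720",\n'
    '"attribute4": "Y"\n'
    '}'
)

merged = merge_lines_breaking_before_dates(sample)
for number, event in enumerate(merged, start=1):
    print(f"--- event {number} ---")
    print(event)
```

With this simplified rule the attribute3 line starts a second event, which matches what was described in the question.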

In general, it is always better to define a specific LINE_BREAKER, set SHOULD_LINEMERGE to false, and define explicit timestamp configuration as well (TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD). This not only improves the reliability of parsing, it also greatly improves performance, as Splunk doesn't have to apply all of its auto-detection magic.
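To show what the explicit timestamp settings pin down, here is a small Python sketch that reads only the first 28 characters of an event (mirroring MAX_TIMESTAMP_LOOKAHEAD=28) and parses them with a fixed format. Python's %f is used as a stand-in for Splunk's %3N, which is an assumption of equivalence for this sample only; event_time is a hypothetical helper name.

```python
from datetime import datetime, timedelta

LOOKAHEAD = 28                         # mirrors MAX_TIMESTAMP_LOOKAHEAD=28
PY_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"   # Python stand-in for %Y-%m-%dT%H:%M:%S.%3N%z

def event_time(event):
    """Parse the timestamp from the first LOOKAHEAD characters only."""
    return datetime.strptime(event[:LOOKAHEAD], PY_FORMAT)

# Start of the sample event from the question.
event = (
    "2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 "
    "[Thread-3-ESWriterBolt] DEBUG BaseBolt - {"
)

ts = event_time(event)
print(ts.isoformat())
```

Because only the leading 28 characters are examined, the second date on the first line and any dates inside the JSON are never considered for the event timestamp.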

harsmarvania57
Ultra Champion

Thanks @FrankVl, I've updated the original answer. I didn't notice this because I was testing with only one event.

tlabue
Path Finder

Thanks to both of you for your help. I had tried a LINE_BREAKER previously, but it looks like my regex wasn't quite correct. First indications in the development lab are that this is working.

harsmarvania57
Ultra Champion

Hi,

Can you please post your props.conf for the above data?

tlabue
Path Finder

There really isn't much in it for this log type:

[storm]
EXTRACT-Storm_Class_MessageType = ^[^ \n]* (?P[^ ]+)\s+[(?P\w+)
TRUNCATE = 0

The extraction is to pull some data out of the text part of the message, which is working fine.
