Splunk Search

Multiline events not breaking properly in the middle of indexing

arunloganathan
New Member

I am indexing a .dat file that contains more than 5000 events.
In the middle of indexing, 1 or 2 events are broken incorrectly.
This is the config I used:

props.conf
NO_BINARY_CHECK = true
BREAK_ONLY_BEFORE = ^\d{1,11}\s?,(([^\,]+)?\,?.?),(([^\,]+)?\,?.?)
MAX_TIMESTAMP_LOOKAHEAD = 100
TIME_FORMAT = %Y%m%d%H%M%S%6N
TIME_PREFIX = ^(?:[^,\n]*,){7}
disabled = false
pulldown_type = true

inputs.conf

[monitor:///xxxx]
disabled = false
whitelist=*.dat
time_before_close = 120
multiline_event_extra_waittime = true
index = xxxx
sourcetype = yyyy

Actual Events
00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,26,c2038af5-5b95-4532-bfa2-e2fa54d8a29e,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1232,2016-08-18T07:01:50.679Z,2016-08-18T07:01:52.994Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

00000000000,,xxxx,40667760,19.08.2016,17:00,21:00,20160818070100167747,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,24,854f6e61-bf00-4914-9799-c539eb30be81,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1023,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1066,2016-08-18T07:01:46.089Z,2016-08-18T07:01:49.160Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS
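As a rough cross-check of the timestamp settings, the Python sketch below applies the TIME_PREFIX regex to the first sample event and parses the value that follows it. This only approximates what Splunk does: Python's `%f` stands in for Splunk's `%6N`, the sample line is cut off after the timestamp field, and `extract_timestamp` is a hypothetical helper name.

```python
import re
from datetime import datetime

# First sample event from the post, truncated after the timestamp field.
SAMPLE = "00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,"

# TIME_PREFIX from props.conf: skip the first seven comma-separated fields.
TIME_PREFIX = re.compile(r"^(?:[^,\n]*,){7}")


def extract_timestamp(event: str) -> datetime:
    """Parse the timestamp that starts right after the TIME_PREFIX match."""
    prefix = TIME_PREFIX.match(event)
    if prefix is None:
        raise ValueError("TIME_PREFIX did not match the event")
    # 20 characters: 14 for %Y%m%d%H%M%S plus 6 digits of subseconds.
    raw = event[prefix.end():prefix.end() + 20]
    # Python's %f (microseconds) is used in place of Splunk's %6N.
    return datetime.strptime(raw, "%Y%m%d%H%M%S%f")


if __name__ == "__main__":
    print(extract_timestamp(SAMPLE))
```

If the prefix and format line up, the parsed value should correspond to the eighth field of the event.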

Indexed Events

e,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1232,2016-08-18T07:01:50.679Z,2016-08-18T07:01:52.994Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

60,19.08.2016,17:00,21:00,20160818070100167747,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,24,854f6e61-bf00-4914-9799-c539eb30be81,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1023,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1066,2016-08-18T07:01:46.089Z,2016-08-18T07:01:49.160Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,26,c2038af5-5b95-4532-bfa2-e2fa54d8a29

00000000000,,xxxx,406677
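Joining the indexed fragments back together shows where each break fell. The Python sketch below checks whether a head fragment plus a tail fragment rebuilds one of the actual events and, if so, reports the character offset of the break. The strings are copied from the samples above (the path and file name fields, identical in both events, are factored into one constant), and `split_offset` is a hypothetical helper name.

```python
# Path and file name fields, identical in both events, factored out for brevity.
PATH_FIELDS = ("/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/"
               "delivery_messages_inbound/pending/./xxxx201608180700060000.csv,"
               "xxxx201608180700060000.csv,")

# First actual event and the two indexed fragments that appear to belong to it.
ACTUAL_1 = ("00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,"
            + PATH_FIELDS + "26,c2038af5-5b95-4532-bfa2-e2fa54d8a29e,"
            "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7,"
            "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1232,"
            "2016-08-18T07:01:50.679Z,2016-08-18T07:01:52.994Z,"
            "44,GB,Scheduled,Success,SUCCESS,SUCCESS")
HEAD_1 = ("00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,"
          + PATH_FIELDS + "26,c2038af5-5b95-4532-bfa2-e2fa54d8a29")
TAIL_1 = ("e,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7,"
          "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1232,"
          "2016-08-18T07:01:50.679Z,2016-08-18T07:01:52.994Z,"
          "44,GB,Scheduled,Success,SUCCESS,SUCCESS")

# Second actual event and its two fragments.
ACTUAL_2 = ("00000000000,,xxxx,40667760,19.08.2016,17:00,21:00,20160818070100167747,"
            + PATH_FIELDS + "24,854f6e61-bf00-4914-9799-c539eb30be81,"
            "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1023,"
            "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1066,"
            "2016-08-18T07:01:46.089Z,2016-08-18T07:01:49.160Z,"
            "44,GB,Scheduled,Success,SUCCESS,SUCCESS")
HEAD_2 = "00000000000,,xxxx,406677"
TAIL_2 = ("60,19.08.2016,17:00,21:00,20160818070100167747,"
          + PATH_FIELDS + "24,854f6e61-bf00-4914-9799-c539eb30be81,"
          "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1023,"
          "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1066,"
          "2016-08-18T07:01:46.089Z,2016-08-18T07:01:49.160Z,"
          "44,GB,Scheduled,Success,SUCCESS,SUCCESS")


def split_offset(actual, head, tail):
    """Return the break offset if head + tail rebuilds actual, else None."""
    return len(head) if head + tail == actual else None


if __name__ == "__main__":
    for name, parts in {"event 1": (ACTUAL_1, HEAD_1, TAIL_1),
                        "event 2": (ACTUAL_2, HEAD_2, TAIL_2)}.items():
        print(name, "break offset:", split_offset(*parts))
```

For these samples each pair of fragments rebuilds one complete line, so both breaks fall in the middle of a line rather than at a line start. That would be consistent with a partially written line being read, though nothing here confirms the cause.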

Indextimings

indextime source count
2016-08-18 07:01:49 xxxx 2162
2016-08-18 07:01:52 xxxx 2
2016-08-18 07:01:53 xxxx 2137
2016-08-18 07:01:56 xxxx 2
2016-08-18 07:01:58 xxxx 1266

The same file was indexed at the times listed above, and the batches with a count of 2 contain the split events.
Even though I set time_before_close and multiline_event_extra_waittime = true, 1 or 2 events still get split.

Thanks in advance.

0 Karma

michael_sleep
Communicator

That actually all looks good. I was going to suggest that an EOF might be causing Splunk to split the event; I've had something similar happen before. A good test would be to take that log file (the one with 5000 events) and upload it directly to your indexer through the GUI with the "Add Data" feature. Configure everything the same way and see whether the event still breaks in the middle. I sometimes use this method when the config looks like it should be working. If it works there, then the cause is something else.

0 Karma

arunloganathan
New Member

I tried indexing the data through the GUI, and there was no line-breaking issue. The issue does not happen every day. It occurs randomly: one day 1 event gets split, another day 2 events, but never more than 2. On some days there is no issue at all.

0 Karma

s2_splunk
Splunk Employee

Have you checked splunkd.log for any error messages relating to the LineBreakingProcessor?

Also, if your events always start with 00000000000, why don't you simplify your props.conf setting to BREAK_ONLY_BEFORE=^00000000000?

0 Karma

arunloganathan
New Member

The events do not always start with 00000000000. They can start with other numbers, such as 07548521430.

0 Karma

s2_splunk
Splunk Employee

OK, so what is the pattern then? 1-11 digits, followed by a comma?
If so, you could still simplify it by using BREAK_ONLY_BEFORE=^\d{1,11},
I suspect your line breaking issues stem from an overly complex RegEx, so I would try to use the simplest expression that matches the beginning of your events.

Did you check splunkd.log for any warning/error messages that may provide a hint as to what may be going on? You may also run into default limits as to total event length and/or maximum number of lines per multi-line event.
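To compare the two patterns outside Splunk, a small Python sketch like the one below can run both against line starts taken from this thread. It uses Python's `re` module, which may differ in detail from the regex engine Splunk uses, and the fields after 07548521430 are made up for illustration, since only the leading number was given.

```python
import re

# Pattern from the original props.conf.
ORIGINAL = re.compile(r"^\d{1,11}\s?,(([^\,]+)?\,?.?),(([^\,]+)?\,?.?)")
# Simplified pattern suggested above.
SIMPLIFIED = re.compile(r"^\d{1,11},")

# Line starts from the thread; only the leading fields matter for these patterns.
LINES = {
    "complete event": "00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,",
    # Only the leading number comes from the thread; the rest is hypothetical.
    "other leading number": "07548521430,,xxxx,",
    "tail fragment 1": "e,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7,",
    "tail fragment 2": "60,19.08.2016,17:00,21:00,20160818070100167747,",
}


def compare(lines):
    """Return {label: (original_matches, simplified_matches)} for each line."""
    return {
        label: (bool(ORIGINAL.match(text)), bool(SIMPLIFIED.match(text)))
        for label, text in lines.items()
    }


if __name__ == "__main__":
    for label, (orig, simple) in compare(LINES).items():
        print(f"{label}: original={orig} simplified={simple}")
```

On these samples the two patterns agree on every line. One difference to keep in mind: the simplified pattern drops the optional `\s?`, so a leading number followed by whitespace before the comma would no longer match.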

0 Karma

arunloganathan
New Member

I checked splunkd.log; there are no error or warning events.

0 Karma