Splunk Search

Multiline events not breaking properly in the middle of indexing

arunloganathan
New Member

I am indexing a .dat file that contains more than 5000 events.
In the middle of the file, 1 or 2 events are broken incorrectly.
This is the config I used:

props.conf
NO_BINARY_CHECK = true
BREAK_ONLY_BEFORE = ^\d{1,11}\s?,(([^\,]+)?\,?.?),(([^\,]+)?\,?.?)
MAX_TIMESTAMP_LOOKAHEAD = 100
TIME_FORMAT = %Y%m%d%H%M%S%6N
TIME_PREFIX = ^(?:[^,\n]*,){7}
disabled = false
pulldown_type = true

inputs.conf

[monitor:///xxxx]
disabled = false
whitelist=*.dat
time_before_close = 120
multiline_event_extra_waittime = true
index = xxxx
sourcetype = yyyy

Actual Events
00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,26,c2038af5-5b95-4532-bfa2-e2fa54d8a29e,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1232,2016-08-18T07:01:50.679Z,2016-08-18T07:01:52.994Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

00000000000,,xxxx,40667760,19.08.2016,17:00,21:00,20160818070100167747,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,24,854f6e61-bf00-4914-9799-c539eb30be81,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1023,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1066,2016-08-18T07:01:46.089Z,2016-08-18T07:01:49.160Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS
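
As a sanity check on the timestamp settings, the sketch below applies the posted TIME_PREFIX regex to the leading part of the first sample event and parses what follows it. It is only an approximation of what Splunk does: Python's `re` and `datetime` stand in for Splunk's own parser, and Python's `%f` is assumed to be a fair substitute for `%6N`.

```python
import re
from datetime import datetime

# Leading portion (fields 1 to 8) of the first sample event from the post.
event_prefix = "00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,"

# TIME_PREFIX from the posted props.conf: skip seven comma-terminated fields.
time_prefix = re.compile(r"^(?:[^,\n]*,){7}")

match = time_prefix.match(event_prefix)
# Text that follows the prefix, up to the next comma.
raw_timestamp = event_prefix[match.end():].split(",", 1)[0]

# Assumption: %f (up to six fractional digits) stands in for Splunk's %6N.
parsed = datetime.strptime(raw_timestamp, "%Y%m%d%H%M%S%f")
print(raw_timestamp, "->", parsed.isoformat())
```

With these samples the prefix lands on the eighth field, which is the 20-digit value that TIME_FORMAT describes.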

Indexed Events

e,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1232,2016-08-18T07:01:50.679Z,2016-08-18T07:01:52.994Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

60,19.08.2016,17:00,21:00,20160818070100167747,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,24,854f6e61-bf00-4914-9799-c539eb30be81,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1023,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1066,2016-08-18T07:01:46.089Z,2016-08-18T07:01:49.160Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

00000000000,,xxxx,40673673,19.08.2016,14:00,21:00,20160818070100184759,/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./xxxx201608180700060000.csv,xxxx201608180700060000.csv,26,c2038af5-5b95-4532-bfa2-e2fa54d8a29

00000000000,,xxxx,406677

Index times

indextime            source  count
2016-08-18 07:01:49  xxxx    2162
2016-08-18 07:01:52  xxxx    2
2016-08-18 07:01:53  xxxx    2137
2016-08-18 07:01:56  xxxx    2
2016-08-18 07:01:58  xxxx    1266

The same file was indexed at the times listed above, and the rows with a count of 2 contain the split events.
I set time_before_close and multiline_event_extra_waittime = true, but 1 or 2 events still get split.
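
One way to spot split events after the fact is to count fields. The sketch below is a rough check, not part of the original setup: it assumes a complete event has 22 comma-separated fields (counted from the two complete samples above) and that no field contains an embedded comma. The helper names `field_count` and `is_complete` are hypothetical.

```python
# Assumption: a complete event has 22 comma-separated fields, as counted from
# the two complete sample events in the post; no field contains a comma.
EXPECTED_FIELDS = 22


def field_count(event: str) -> int:
    """Return the number of comma-separated fields in one indexed event."""
    return len(event.split(","))


def is_complete(event: str) -> bool:
    """True when the event has the expected number of fields."""
    return field_count(event) == EXPECTED_FIELDS


# The two halves of one split event, copied from "Indexed Events".
head = "00000000000,,xxxx,406677"
tail = (
    "60,19.08.2016,17:00,21:00,20160818070100167747,"
    "/ablive/data/yyyy/serial/yyyy/DISTRIBUTION/DELIVERY/"
    "delivery_messages_inbound/pending/./xxxx201608180700060000.csv,"
    "xxxx201608180700060000.csv,24,854f6e61-bf00-4914-9799-c539eb30be81,"
    "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1023,"
    "22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-1066,"
    "2016-08-18T07:01:46.089Z,2016-08-18T07:01:49.160Z,"
    "44,GB,Scheduled,Success,SUCCESS,SUCCESS"
)

for label, text in [("head", head), ("tail", tail), ("rejoined", head + tail)]:
    print(label, field_count(text), is_complete(text))
```

Joining the two halves gives back the second sample event with nothing missing, so the split falls in the middle of a line rather than at a line boundary.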

Thanks in advance.


michael_sleep
Communicator

That actually all looks good. I was going to suggest that an EOF might be causing Splunk to split the event; I've had something similar happen before. I think a good test would be to take that log file (the one with 5000 events) and upload it directly to your indexer through the GUI with the "Add Data" feature. Configure everything the same and see whether the event still breaks in the middle. I use this method sometimes when the config looks like it should be working. If it works there, then it means it's something else.


arunloganathan
New Member

I tried indexing the data using the GUI. There was no line-breaking issue. The issue does not happen every day. It happens randomly: one day 1 event gets split, another day 2 events, but never more than 2. On some days there is no issue at all.


s2_splunk
Splunk Employee

Have you checked splunkd.log for any error messages relating to the LineBreakingProcessor?

Also, if your events always start with 00000000000, why don't you simplify your props.conf setting to BREAK_ONLY_BEFORE=^00000000000?


arunloganathan
New Member

The events do not always start with 00000000000. They can start with other numbers, such as 07548521430.


s2_splunk
Splunk Employee

OK, so what is the pattern then? 1-11 digits, followed by a comma?
If so, you could still simplify it by using BREAK_ONLY_BEFORE=^\d{1,11},
I suspect your line breaking issues stem from an overly complex RegEx, so I would try to use the simplest expression that matches the beginning of your events.

Did you check splunkd.log for any warning/error messages that may provide a hint as to what may be going on? You may also be running into the default limits on total event length (TRUNCATE) and/or the maximum number of lines per multi-line event (MAX_EVENTS).
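
The simplification suggested above can be compared with the original pattern outside Splunk. This is only a sketch: Python's `re` module stands in for Splunk's PCRE engine, which is assumed to behave the same for these two expressions, and the sample strings are line starts copied from earlier in the thread.

```python
import re

# BREAK_ONLY_BEFORE from the originally posted props.conf.
original = re.compile(r"^\d{1,11}\s?,(([^\,]+)?\,?.?),(([^\,]+)?\,?.?)")
# Simplified pattern suggested in the reply above.
simplified = re.compile(r"^\d{1,11},")

# Line starts copied from the thread.
samples = [
    "00000000000,,xxxx,40673673,19.08.2016,14:00,21:00",  # genuine event start
    "60,19.08.2016,17:00,21:00,20160818070100167747",     # tail of a split event
    "e,22a301ea-26-a666-5e1b87780-ac168f26_57b54f17_2dc00d6-11b7",  # tail of a split event
]

# One (original, simplified) pair of booleans per sample.
results = [
    (bool(original.search(s)), bool(simplified.search(s))) for s in samples
]

for sample, (orig_hit, simple_hit) in zip(samples, results):
    print(f"{sample[:24]!r:28} original={orig_hit} simplified={simple_hit}")
```

On these samples the two patterns agree, so the shorter one looks like a safe replacement for the posted data. Both also match the fragment that happens to begin with digits, so neither pattern can tell a genuine event start from a split tail of that shape.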


arunloganathan
New Member

I checked splunkd.log; there are no error or warning events.
