Getting Data In

How do I configure Splunk to prevent multiple events from being parsed as a single event?

Michael
Contributor

I see a lot of Splunk Answers posts about multi-line entries being broken up into separate events. I have the opposite problem: multiple events being reported as a single entry.

I have two (identically configured?) Suricata boxes logging with fast.log enabled, using universal forwarders. On one, each alert gets recorded as its own event (as it should), but the other is combining different alert events (with different timestamps) into a single Splunk event. Apologies if the formatting below doesn't allow this to be presented correctly, but you get the idea... example:

Time    Event
7/11/16
2:10:36.000 PM  
07/11/2016-14:10:36.353417  [**] [1:12053001:1] test jabberwocky [**] [Classification: (null)] [Priority: 3] {TCP} 10.15.9.202:2285 -> 209.135.140.78:80
07/11/2016-14:10:36.504980  [**] [1:12053001:1] test jabberwocky [**] [Classification: (null)] [Priority: 3] {TCP} 10.15.9.202:2285 -> 209.135.140.78:80

This was working before -- it just started after re-pointing these boxes to a new indexer cluster... (yes, I removed the pointers to the old instance, and confirmed).

Thoughts?
Thanks,
Mike

0 Karma

Michael
Contributor

Finally think I have this licked...

After trying a gazillion things, including SHOULD_LINEMERGE, MUST_BREAK_AFTER, BREAK_ONLY_BEFORE, LINE_BREAKER, TIME_FORMAT... etc., etc., etc...

I took a sample of the text and started the import process into a local instance, via the GUI, on my desktop. I noticed that when I left the sourcetype at "default" it parsed the lines correctly. But when I changed it to "Snort" (which this Suricata data mostly resembles), it did the same goofy combining of multiple events into one event. Aha! I thought.

I changed the sourcetype in inputs.conf to "default" -- and it worked!

I then changed the sourcetype back to suricata -- and copied the "default" stanza out of the ../default/props.conf into ../local/props.conf and relabeled it to suricata.

Voila!

The bottom line is: I was making it harder than it needed to be. Default worked fine. I didn't need to do all that other mucking about in props.conf with regexes, time formats, and line-break settings. I also did NOT have to edit props.conf on my indexers (of which I have 3 in my cluster) -- as some suggested.
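
For anyone who wants to do the same, the relabeled stanza amounts to something like this (illustrative only -- copy whatever is actually in your version's default/props.conf rather than trusting my paraphrase):

[suricata]
# Stock defaults: merge lines, but break before anything that looks like a date
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=true
# Stock line breaker and truncation limit
LINE_BREAKER=([\r\n]+)
TRUNCATE=10000

Since each fast.log line starts with its own timestamp, the break-before-date behavior splits them correctly.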

Thanks for all the suggestions -- and there are a ton of other posts out here asking this same question in different ways, and 10 times as many people suggesting vastly different things (that didn't work). I sometimes wish people would answer with experiences that actually worked for them, instead of leading people down rabbit holes by slapping things up against the wall to see what sticks...

Cheers,
Mike

0 Karma

cpetterborg
SplunkTrust
SplunkTrust

I've recently come across a couple of log types that can't be parsed properly when the date format is passed in the props.conf file. One of them included a 12-hour AM/PM format that would only work if it was given the parsing parameters in props.conf WITHOUT the timestamp format (just defaulting). The data input tool would say that the timestamp formatting was correct, but Splunk itself didn't parse the data properly: it would not use the AM/PM part of the timestamp, so all the data for 24 hours was overlaid into a 12-hour period. I can only believe that the code inside Splunk was doing something wrong with the format string.
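
To illustrate the kind of format I mean, something along these lines (the sourcetype name and format string here are made up, just to show the 12-hour %I/%p case):

[some_12hour_sourcetype]
# Hypothetical example: 12-hour clock with an AM/PM field
TIME_FORMAT=%m/%d/%Y %I:%M:%S %p

With an explicit format like that, the %p (AM/PM) part was effectively ignored; removing TIME_FORMAT and letting Splunk default fixed it.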

I would dare to guess that this is the type of problem that you were seeing as well. Needless to say, we need to remember that sometimes there are actually bugs in the Splunk code base. Sorry Splunk. 😞 I'm not going to abandon you, though. There was a bug in about 3 versions of Splunk that we kept waiting for the supposed fix to be implemented correctly (and we had been told about the bug and the fix early on). The feature had been working, then it wasn't for 3 versions, then they finally, really, actually fixed the bug. 🙂

0 Karma

cpetterborg
SplunkTrust
SplunkTrust

Having seen your additional information, I suspect the change in sourcetype is a likely cause of the difference. Do you have a props.conf file on your indexer(s) that defines how this sourcetype is supposed to be parsed? I would use something like this:

[suricata]
# Break on every newline rather than merging lines into one event
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
# Matches timestamps like 07/11/2016-14:10:36.353417
TIME_FORMAT=%m/%d/%Y-%H:%M:%S.%N
# Timestamp sits at the start of the event, well within 32 characters
MAX_TIMESTAMP_LOOKAHEAD=32

This should read the date correctly, and it should break on each line. If you don't have a props.conf file that defines things for this sourcetype, you could look for the old sourcetype on your old server. If this doesn't work, then I would look for another props.conf file that is causing the file to be mis-interpreted.
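
One quick way to see which props.conf is winning for the sourcetype is btool, run on the instance that does the parsing (indexer or heavy forwarder):

$SPLUNK_HOME/bin/splunk btool props list suricata --debug

The --debug flag prints the file each setting came from, which makes a conflicting stanza easy to spot.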

Michael
Contributor

Sorry, had to take back my "accepted answer"...

Wow, still working this issue (coming back around to it). I discovered after the fact that http.log and tls.log are still exhibiting this behavior, and it's coming from both Suricata systems. The props.conf did get pushed to the indexers, and I tried multiple settings for MAX_TIMESTAMP_LOOKAHEAD.

Did notice that the timestamps seem to be getting rounded off...

For example these lines:
09/14/2016-10:44:30.066431 stuff redacted
09/14/2016-10:44:30.145560 stuff redacted
09/14/2016-10:44:30.153100 stuff redacted

Will all get reported as:
09/14/2016-10:44:30.000 AM
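
For what it's worth, a stanza of roughly this shape should in theory capture the microseconds (illustrative only, not necessarily what's deployed here):

[suricata]
# Anchor to the start of the event and spell out the microseconds explicitly
TIME_PREFIX=^
TIME_FORMAT=%m/%d/%Y-%H:%M:%S.%6N
# Timestamp is 26 characters, so this is plenty
MAX_TIMESTAMP_LOOKAHEAD=30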

0 Karma

Michael
Contributor

I originally made the changes on the search head (master) but a co-admin noted that you said to do it on the indexers. Ha! Reading is fundamental 😕 .

Changes made, tested successfully. Thanks!

0 Karma

cpetterborg
SplunkTrust
SplunkTrust

If you have an index cluster, it might be best to deploy using the cluster master. That is how I do it. If you have questions about that, go ahead and ask. 🙂
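
Roughly: put the props.conf in an app under $SPLUNK_HOME/etc/master-apps/ on the cluster master (the app name below is just an example), then push the bundle to the peers:

# e.g. $SPLUNK_HOME/etc/master-apps/suricata_props/local/props.conf
$SPLUNK_HOME/bin/splunk apply cluster-bundle --answer-yes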

0 Karma

Michael
Contributor

Ya, that's how we did it. Changes made to master, pushed to indexers (Distribute Configuration Bundle).

Thanks!

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Has the sourcetype definition (event processing rules) for this log also been migrated to the new cluster? Could you validate whether the log format has changed and whether those configurations are still valid?

0 Karma

Michael
Contributor

We still have the old server up, and I've confirmed the formats of the logs have not changed. It was feeding into a single instance before; now it's feeding through forwarders to a cluster of 3 indexers. Also confirmed: the historical data on the old server shows it broke the individual events up properly.

The sourcetypes are all properly labeled as "suricata" -- that did change from the previous "suricata_alert". All the inputs.conf files have been cleaned of the old label on the new instance.
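
The cleaned-up stanzas are roughly of this shape (illustrative -- the path shown is just the stock Suricata fast.log location):

[monitor:///var/log/suricata/fast.log]
sourcetype=suricata
disabled=false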

With my new sample posted today, you'll note that it's not even using the seconds as a delineation -- there's one at 44 seconds, then two more at 45 seconds.

0 Karma

cpetterborg
SplunkTrust
SplunkTrust

It looks like you have more than one date format in the file. Is this correct?

7/11/16
2:10:36.000 PM
07/11/2016-14:10:36.353417
0 Karma

Michael
Contributor

No, the formatting of this wiki input makes it difficult to display adequately. Here's another sample from this morning:
7/12/16
4:41:46.000 AM


07/12/2016-04:41:44.761595 [**] [1:14091903:2] test_rule [**] [Classification: (null)] [Priority: 3] {TCP} 99.6.202.199:53686 -> 123.112.112.162:8975
07/12/2016-04:41:45.566909 [**] [1:14091903:2] test_rule [**] [Classification: (null)] [Priority: 3] {TCP} 123.112.112.162:8975 -> 99.6.202.199:53686
07/12/2016-04:41:45.566959 [**] [1:14091903:2] test_rule [**] [Classification: (null)] [Priority: 3] {TCP} 99.6.202.199:53686 -> 123.112.112.162:8975

The bold is the _time extraction; the other times are the actual events (_raw) being grouped together as a multi-line event...

0 Karma

Michael
Contributor

And note that the times are getting rounded off:

For example, these lines:
09/14/2016-10:44:30.066431 stuff redacted
09/14/2016-10:44:30.145560 stuff redacted
09/14/2016-10:44:30.153100 stuff redacted

Will all get reported as one event with this timestamp:
09/14/2016-10:44:30.000 AM

0 Karma