
duplicate index entries

Branden
Builder

I'm having what appears to be a logic problem, but it could be something else.

I have an app that displays the output of an error log. Output looks something like this (may be familiar to those who have been helping me with other issues):

AA8AB241   0901122910 T O OPERATOR       OPERATOR NOTIFICATION

It parses the timestamp correctly. The error log script outputs any errors that have occurred in the past 60 seconds, and Splunk runs that every 60 seconds.

Unfortunately, Splunk appears to be indexing each entry 3-5 times. In other words, for each one of those entries, I get 3-5 identical events in the index.

I'm wondering if it has to do with my "every 60 seconds, produce results from the past 60 seconds" logic. Or could it be something else?

Any feedback is appreciated.

Thanks!

Edit:

Here is the inputs.conf file (note: this is on the forwarder):

[script://splunk/etc/apps/all/bin/errptsplunk.sh]
interval = 60  # Run every minute
sourcetype = errpt
source = script://./bin/errptsplunk.sh

And props.conf (on the indexer):

[errpt]
SHOULD_LINEMERGE = false
TIME_PREFIX = ^\S+\s+
TIME_FORMAT=%m%d%H%M%y
MAX_TIMESTAMP_LOOKAHEAD = 25

gkanapathy
Splunk Employee

This isn't in a file, though, is it? You're just using a scripted input that calls the errpt shell command, and you're certain that it only gives you back new items from the past 60 seconds? And that you don't have multiple instances of the script input running in different apps or something like that?


Branden
Builder

Interesting point. I mentioned that to the support engineer. I'll see what they come back with. Thank you again for all your help.


gkanapathy
Splunk Employee

Hmm, the 256-byte problem really only affects file inputs. For script inputs, this seems like something else entirely.


Branden
Builder

Upgraded one of the forwarders just now. Problem still exists. 😞


Branden
Builder

Splunk support got back to me. My indexer is running 4.1.4, but my forwarders are running 4.0.9. According to support, 4.0.x and 3.x versions do not handle files/inputs smaller than 256 bytes very well. Each errpt entry is 64 bytes. I will upgrade the forwarders next week and see what happens.


gkanapathy
Splunk Employee

Yeah, that sounds like a bug.


Branden
Builder

I think I'm going to open a case with Splunk tech support on this one. Two of the errpt entries just got indexed over 10,000 times. Oops!


Branden
Builder

Question edited with inputs.conf and props.conf errpt stanzas. Thanks!


gkanapathy
Splunk Employee

Sounds a lot like a bug in Splunk scripted input, but if you update the question with the complete inputs.conf stanza and the props.conf config for the source/sourcetype, we can see if there's anything there.


Branden
Builder

That's correct.


gkanapathy
Splunk Employee

Interesting. So it was okay before you made the config change to the sourcetype to index it as individual lines, i.e., there were no repeats then?


Branden
Builder

Correct, it is not a file; it's capturing the output of the errpt command. It should only be giving back items from the past 60 seconds... even if there was a duplicate because of my logic, I could see it indexing twice, but not 3-5 times. I'm certain the script is only running in one place.
Actually, this all started after fixing the previous issue I wrote about with the errpt command (splitting it up into individual lines). I'm not sure if that's related or not...
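
For anyone who wants to rule out the "every 60 seconds, report the past 60 seconds" overlap as a cause, here is a minimal sketch of a filter that only passes lines the previous run did not already emit. The function name `emit_new_lines` and the state file are hypothetical (they are not part of the original script), and the sketch assumes `mktemp` and a POSIX `awk` are available on the forwarder. It would not help if the duplication happens inside Splunk itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original errptsplunk.sh.
# Reads candidate lines on stdin and prints only those that were NOT
# present in the previous run's output, which is kept in a state file.
emit_new_lines() {
    _state="$1"                 # path to the state file (assumed writable)
    touch "$_state"             # make sure it exists on the first run
    _cur="$(mktemp)"            # holds the current run's full output
    cat > "$_cur"

    # Load the previous run's lines, then print current lines not seen before.
    # FILENAME is compared explicitly so an empty state file is handled.
    awk -v s="$_state" '
        FILENAME == s { seen[$0] = 1; next }
        !($0 in seen)
    ' "$_state" "$_cur"

    # The current run's full output becomes the state for the next run.
    mv "$_cur" "$_state"
}
```

In the script, the existing errpt call would be piped through it, e.g. `<existing errpt command> | emit_new_lines /path/to/state`, where the state path is whatever location suits the forwarder. Identical lines within a single run are still passed through, so two genuine errors in the same minute are not collapsed.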


hulahoop
Splunk Employee

There is a current problem under investigation where events in very small files are getting duplicated. I'll post here when we have more details, but we've been able to reproduce this and will hopefully have a workaround soon if not a fix. It's not you or your logic. 🙂


Jeremiah
Motivator

Can you have your script temporarily write entries to a file as well as stdout? Then you could verify if the script was duplicating output between runs.
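
Something along these lines might work as a temporary debug wrapper. It is only a sketch: `run_and_capture`, `list_repeats`, and the `DEBUG_LOG` path are made-up names, not anything from the original script.

```shell
#!/bin/sh
# Hypothetical debug helpers for temporarily capturing script output.
# DEBUG_LOG is an assumed location; change it to suit the forwarder.
DEBUG_LOG="${DEBUG_LOG:-/tmp/errptsplunk_debug.log}"

run_and_capture() {
    # Run the given command, pass its output to stdout unchanged
    # (so Splunk still receives it), and append a copy to DEBUG_LOG.
    "$@" | tee -a "$DEBUG_LOG"
}

list_repeats() {
    # Print every line that occurs more than once in the capture file,
    # prefixed with the number of occurrences.
    sort "$DEBUG_LOG" | uniq -c | awk '$1 > 1'
}
```

Inside the script, the errpt call would be wrapped as `run_and_capture errpt` followed by whatever arguments it already uses. After a few runs, `list_repeats` shows whether the script itself produced the same line more than once. If the capture file has no repeats but the index does, the duplication is happening after the script.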


Branden
Builder

They're one line long, just like above.
It's possible they could be multiple lines if there were multiple errors within that one minute.


hulahoop
Splunk Employee

Branden, is this error log very small?
