Script Indexed

dgadjov
Explorer

I have a script that is collecting data and then outputs it to a directory.
The script is being run by Splunk every hour and the directory is being continuously indexed.
After the script gets the data, it checks a temp file to make sure there are no duplicate events and writes only new events into the indexed directory.

The problem is that the data is still indexed even though the file is not being written to.
Which Python commands will make Splunk index the script's data directly?
My script has no print statements; it only opens and reads files and writes output to files.


sowings
Splunk Employee

The usual behavior of scripted inputs in Splunk is to index the STDOUT from the script. If you adapt your script to output only the new events to STDOUT, you shouldn't get that data duplication.
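As a rough illustration of that suggestion, here is a minimal Python sketch of a scripted input that prints only unseen events to STDOUT. The function names and the checkpoint file are hypothetical, and the sketch assumes each event is a single line of text; it is not taken from the original script.

```python
import sys
from pathlib import Path


def load_seen(checkpoint):
    """Return the set of events recorded by earlier runs (empty on first run)."""
    if not checkpoint.exists():
        return set()
    return set(checkpoint.read_text(encoding="utf-8").splitlines())


def emit_new_events(events, checkpoint, out=sys.stdout):
    """Write events not seen before to `out` and append them to the checkpoint.

    Assumption: events are single-line strings, so one line equals one event.
    """
    seen = load_seen(checkpoint)
    new_events = []
    for event in events:
        if event not in seen:
            seen.add(event)
            new_events.append(event)

    # Splunk indexes whatever a scripted input writes to STDOUT.
    for event in new_events:
        out.write(event + "\n")

    # Remember what was emitted so the next run skips it.
    if new_events:
        with checkpoint.open("a", encoding="utf-8") as fh:
            for event in new_events:
                fh.write(event + "\n")
    return new_events
```

With something like this, the script would call `emit_new_events(collected, Path("seen_events.txt"))` at the end of each run, and the checkpoint file would live somewhere Splunk is not monitoring.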


dgadjov
Explorer

I have two folders called 'temp' and 'data'. When the script runs, it collects some data and compares it with the contents of the 'temp' folder. If there is a difference, it builds a list of the differences and writes them to the 'data' folder. It then writes all of the collected data into 'temp' to represent the latest record. If no differences are found, nothing is written to either the 'temp' or the 'data' folder.
'temp' is just a temporary folder, but 'data' is the folder that is being continuously indexed.
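For readers following along, the workflow described above might look roughly like the sketch below. This is not the actual script: the function name, the single snapshot file inside 'temp', the one-new-file-per-run naming in 'data', and the reading of "differences" as events absent from the previous snapshot are all assumptions.

```python
import time
from pathlib import Path


def write_differences(collected, temp_file, data_dir):
    """Write events missing from the last snapshot to a new file in data_dir.

    Returns the path of the new file, or None when nothing changed.
    Assumption: events are single-line strings.
    """
    previous = set()
    if temp_file.exists():
        previous = set(temp_file.read_text(encoding="utf-8").splitlines())

    differences = [event for event in collected if event not in previous]
    if not differences:
        return None  # no differences: leave both folders untouched

    # Only the differences go into the monitored 'data' folder.
    data_dir.mkdir(parents=True, exist_ok=True)
    out_file = data_dir / f"events_{time.time_ns()}.log"
    out_file.write_text("\n".join(differences) + "\n", encoding="utf-8")

    # The full collection becomes the latest record in 'temp'.
    temp_file.parent.mkdir(parents=True, exist_ok=True)
    temp_file.write_text("\n".join(collected) + "\n", encoding="utf-8")
    return out_file
```

In a layout like this, only the files under 'data' should ever be picked up, so duplicates in the index would suggest that Splunk is also reading 'temp' or re-reading a file that was rewritten.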


sowings
Splunk Employee

You missed my point, which was a suggestion to re-work your script expressly to write its results to STDOUT.

In any event, let's figure out why it's grabbing your events. Does Splunk know about the location where the temp files are written? Could it be indexing those files?

When you write the new outputs, are you writing everything to a new file or reusing an existing one?

When you say "The problem is that the data is still indexed even though the file is not being written to," which file do you mean? Can you give us some sample filenames to make the scenario easier to follow?


dgadjov
Explorer

The issue is that my script doesn't write to STDOUT anywhere, yet the data is still being indexed.
