Data reindexed after UF app updated using deployment server

beaumaris
Communicator

We are using a 4.2.1 UF node to monitor a directory that contains web access log files and send those files to an indexer. All of our nodes, including the UF and IDX nodes, have custom apps deployed to them from a central deployment server running on a Job Scheduler node. We're observing that if we update the inputs.conf file that defines the monitoring on the UF node, then once the updated app is deployed to the UF (which causes Splunk to restart), all of the web access log files are re-sent to the indexer, which causes duplicate events in the system.

It seems like Splunk would do the bookkeeping for the monitoring process somewhere outside of the .../splunk/etc/apps directory, which is the only thing changing in the above scenario. We would not expect the files to be read and sent again just because a custom app was updated. Has anyone seen this issue, and is there something we can do to prevent the files from being re-read? After all, if we simply restart Splunk on the UF node, the files are not read again.


Drainy
Champion

Splunk uses something called the fishbucket (don't worry about the name, that's just what it's called 🙂 ) to track which files it has read and how far into each file it has read, so this bookkeeping is managed outside of the config.

Modifying the inputs to change the target index or other details still shouldn't cause the UF to re-index the data; for that you would normally need to clean the fishbucket first.

So, first have a look at the splunkd.log file in the SPLUNK_HOME/var/log/splunk/ directory and see whether it gives any indication of why the files are being indexed again, or whether there are any errors related to the fishbucket.
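If the log is large, a small script can narrow it down. This is only a rough sketch: the fallback install path and the keyword list are my assumptions, not official Splunk component names, so adjust both for your environment.

```python
import os
import re

# Assumed fallback location; set SPLUNK_HOME to match your forwarder install.
SPLUNK_HOME = os.environ.get("SPLUNK_HOME", "/opt/splunkforwarder")
LOG_PATH = os.path.join(SPLUNK_HOME, "var", "log", "splunk", "splunkd.log")

# Hypothetical keywords worth looking for; extend with your monitored path.
KEYWORDS = re.compile(r"fishbucket|crc|checksum", re.IGNORECASE)


def interesting_lines(lines, pattern=KEYWORDS):
    """Return the lines (without trailing newline) that match the pattern."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


if __name__ == "__main__":
    if os.path.exists(LOG_PATH):
        # errors="replace" keeps the scan going if the log has odd bytes.
        with open(LOG_PATH, errors="replace") as fh:
            for line in interesting_lines(fh):
                print(line)
    else:
        print("splunkd.log not found at", LOG_PATH)
```

Run it on the UF right after the deployment server pushes the updated app, and compare the output with what you see after a plain restart.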

The other possibility is log rotation, or the way the logs are appended to: either could interfere with how Splunk monitors the log file and make it think the file has changed substantially enough to index the whole file again.
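To show why a change at the start of a file, or lost tracking state, leads to a full re-read, here is a toy model of checksum-plus-offset bookkeeping. It is not Splunk's actual implementation: the `ToyTracker` class is hypothetical, and checksumming the first 256 bytes with `zlib.crc32` is my assumption about the general idea.

```python
import zlib

HEAD_BYTES = 256  # assumption: only the start of the file is checksummed


class ToyTracker:
    """Toy bookkeeping: remembers how far each known file has been read."""

    def __init__(self):
        self.offsets = {}  # head checksum -> number of bytes already read

    def read_new(self, content: bytes) -> bytes:
        """Return the part of the content that has not been read before."""
        key = zlib.crc32(content[:HEAD_BYTES])
        start = self.offsets.get(key, 0)  # unknown head means start at 0
        if start > len(content):
            start = 0  # file shrank, treat it as new
        self.offsets[key] = len(content)
        return content[start:]


tracker = ToyTracker()
log = b"line1\n" * 100

first = tracker.read_new(log)                    # whole file, never seen
second = tracker.read_new(log)                   # nothing new
tail = tracker.read_new(log + b"line2\n")        # only the appended part
rewritten = b"# new header\n" + log
again = tracker.read_new(rewritten)              # head changed: full re-read

# A tracker with no saved state re-reads everything, which would look
# like the duplicate events described in the question.
fresh = ToyTracker().read_new(log)
```

In this model a plain restart is harmless as long as the saved offsets survive, so it is worth checking whether anything in the app update changes how the files are identified or resets that state.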


beaumaris
Communicator

Anyone?
This is still an issue for us. We do not expect the UF to re-send already processed logs to the indexer after an updated app is deployed and Splunk is restarted.
