I have multiple monitored csv files that are created every day at different times on a single server with a Universal Forwarder. Old files are deleted and completely new files are created. Each file is indexed when it is created and then again at 05:30 am the next day, causing duplicate data.
Looking through Splunk>answers, I found where I should add the crcSalt = line in the [monitor] section of inputs.conf. I did this for one of the files and the file is still being indexed twice.
What else should I do to stop the second indexing?
I do find it interesting that the second indexing for all of these files happens at the same time. Is there some config that sets that time? Just wondering.
Thanks for any help provided.
Scott
You have something external to splunk that is updating the file's last modified date at 5:30 AM daily. Check your crontab, backup routines, log rotation routines, etc.
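One way to check this is to watch the file yourself and see what actually changes at 5:30. Below is a minimal Python sketch, not anything Splunk ships: the function names, the path in the usage comment, and the 60-second polling interval are made up for illustration. It records the modification time, the size, and a hash of the first 256 bytes (256 is, as far as I know, the default length Splunk uses for its initial CRC check, so treat that number as an assumption too).

```python
import hashlib
import os
import time


def snapshot(path, head_bytes=256):
    """Return (mtime, size, sha256 of the first head_bytes bytes) for path."""
    st = os.stat(path)
    with open(path, "rb") as f:
        head = f.read(head_bytes)
    return st.st_mtime, st.st_size, hashlib.sha256(head).hexdigest()


def describe_change(old, new):
    """List which of mtime, size, head differ between two snapshots."""
    names = ("mtime", "size", "head")
    return [name for name, a, b in zip(names, old, new) if a != b]


def watch(path, interval=60):
    """Poll path forever and print a line whenever something changes."""
    last = snapshot(path)
    while True:
        time.sleep(interval)
        try:
            current = snapshot(path)
        except FileNotFoundError:
            # The file was deleted; keep the last snapshot and try again.
            print(time.ctime(), "file missing")
            continue
        changed = describe_change(last, current)
        if changed:
            print(time.ctime(), "changed:", ", ".join(changed))
        last = current


# Hypothetical usage; replace the path with one of your monitored files:
# watch("/data/reports/logons.csv")
```

If only mtime changes at 5:30, something is touching the file without rewriting it. If size or head changes as well, the file is being rewritten or replaced, which would explain why it looks new to the forwarder.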
Maybe the following can help - Logging best practices
Under Use rotation policies it speaks about - ...set up good rotation strategies ...
Wiping out the log file and starting a fresh one is not a good log practice.
These are not log files. At 10:30 am, an application creates a csv file showing itemnumber, username, logon time and logoff time for the previous day. Splunk indexes the file at that time. Then at 05:30 the next day, the file is indexed again.
I have 3 other csv files, each created by a different application at a different time, but the second indexing always happens at 05:30.
Fix the thing that is logging in such a silly fashion.
Could you be a little more specific? What entity should I be looking at that would be "logging in" in such a silly fashion?