Hello,
I have an auto-index set up on a folder in my Splunk directory. The past two times a user copied their data in .csv form into the folder, it was indexed as a .tmp file. How can I fix this problem and ensure .tmp files are not auto-indexed?
The .tmp file was indexed and the actual .csv never got indexed. I deleted the .tmp source type data from Splunk, deleted the source file from the directory, renamed it, and copied it back over, but the data still didn't get indexed. I ended up having to manually upload the file.
The reason it was not indexed after you fixed it is that, by default, Splunk does not treat the file name as uniquely identifying a source. Many systems rotate logs in place, so if the name were used, a log file would be indexed again every time it was rotated to a backup name. Splunk therefore considers /your/path/to/file_foo.csv to be the same file as /your/path/to/file_bar.tmp as long as the first X bytes and last Y bytes match. You can change this behaviour by setting crcSalt=<SOURCE> (yes, use literally that exact string) in your inputs.conf:
http://docs.splunk.com/Documentation/Splunk/latest/admin/Inputsconf
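As a rough sketch of what that could look like, assuming the monitored folder is the /your/path/to placeholder used above (substitute your real path):

```ini
# inputs.conf (sketch; the monitored path is a placeholder)
[monitor:///your/path/to]
# Literal string: tells Splunk to mix the full source path into the CRC,
# so the same content under a different file name counts as a new file.
crcSalt = <SOURCE>
```

Keep in mind the trade-off described above: with this setting, a file that gets renamed or rotated inside the monitored folder will be indexed again.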
Hey @katzr! If @woodcock or @richgalloway solved your problem, please don't forget to accept an answer! You can upvote posts as well. (Karma points will be awarded for either action.) Happy Splunking!
Change your inputs.conf file to add a whitelist attribute to your monitor stanza. Something like whitelist = \.csv$ should limit Splunk to CSV files.
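For illustration, a stanza along these lines might do it. The path is the placeholder from the earlier answer, and the commented-out blacklist line is an alternative I am assuming you might prefer if you only want to exclude .tmp files rather than allow only .csv:

```ini
# inputs.conf (sketch; the monitored path is a placeholder)
[monitor:///your/path/to]
# Only index files whose full path ends in .csv
whitelist = \.csv$
# Alternative: index everything except .tmp files
# blacklist = \.tmp$
```

Both settings are regular expressions matched against the file path, hence the escaped dot and the trailing $.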
Can I do that for just one specific index, though?
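The whitelist applies per monitored input rather than per index. You set it in inputs.conf for each [monitor://...] stanza, so add it only to the stanza that feeds the index you care about.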
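Something like the following, where both index names and the second path are made up purely for the example:

```ini
# inputs.conf (sketch; paths and index names are hypothetical)

# Input feeding the index that should only receive CSV files
[monitor:///your/path/to]
index = csv_uploads
whitelist = \.csv$

# A different input with no whitelist: every file in it is indexed
[monitor:///some/other/path]
index = main
```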