Data being auto-indexed as .tmp file instead of .csv

katzr
Path Finder

Hello,

I have an auto-index set up on a folder in my Splunk directory, and the past two times a user copied their data in .csv form into the folder, it was indexed as a .tmp file. How can I fix this problem and ensure .tmp files are not auto-indexed?

The .tmp file was indexed and the actual .csv never was. I deleted the .tmp source type data from Splunk, deleted the source file from the directory, renamed it, and copied it back over, but the data still didn't get indexed. I ended up having to upload the file manually.

woodcock
Esteemed Legend

The reason it was not indexed after you fixed it is that, by default, Splunk does not treat the file name as uniquely identifying a source. Many systems rotate logs in place, so if Splunk did, every log file rotated to a backup name would be indexed again. Splunk therefore considers /your/path/to/file_foo.csv to be the same file as /your/path/to/file_bar.tmp as long as the first X bytes and last Y bytes match. You can change this behaviour by setting crcSalt=<SOURCE> (yes, use literally that exact string) in your inputs.conf:

http://docs.splunk.com/Documentation/Splunk/latest/admin/Inputsconf
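
A minimal sketch of what such a stanza might look like. The monitored path is only the placeholder used above, not the real directory from the question:

```ini
# inputs.conf (sketch; the path below is a placeholder)
[monitor:///your/path/to]
# Mix the full source path into the file checksum, so that a copy with
# the same leading content but a different name counts as a new file.
crcSalt = <SOURCE>
```

Given the rotation behaviour described above, this is probably best limited to directories where files are not rotated in place, since a rotated log under a new name would be indexed again.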

lfedak_splunk
Splunk Employee

Hey @katzr! If @woodcock or @richgalloway solved your problem, please don't forget to accept an answer! You can upvote posts as well. (Karma points will be awarded for either action.) Happy Splunking!

richgalloway
SplunkTrust

Change your inputs.conf file to add a whitelist attribute to your monitor stanza. Something like whitelist = \.csv$ should limit Splunk to CSV files.
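
As a rough sketch, assuming a placeholder path for the monitored folder, the stanza could look like this. The commented-out blacklist line is an alternative that is not from this thread, shown only as an assumption about what might also fit the question:

```ini
# inputs.conf (sketch; the path below is a placeholder)
[monitor:///your/path/to]
# Only pick up files whose path ends in .csv
whitelist = \.csv$
# Alternative (assumption): keep everything except .tmp files
# blacklist = \.tmp$
```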

katzr
Path Finder

Can I do that for just one specific index, though?

woodcock
Esteemed Legend

You do it in inputs.conf for each [monitor://...] stanza.
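
A sketch of how that per-stanza scoping might look. Both directory paths and both index names are hypothetical and do not come from the question:

```ini
# inputs.conf (sketch; paths and index names are hypothetical)

# Files from this folder go to the "user_uploads" index,
# and only .csv files are picked up.
[monitor:///your/path/to/uploads]
index = user_uploads
whitelist = \.csv$

# This stanza has no whitelist, so every file in the folder is indexed.
[monitor:///your/path/to/other_logs]
index = main
```

The whitelist applies to the input rather than to the index, so limiting it to "one index" in practice means adding it only to the stanzas that send data to that index.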
