I'm seeing a build-up of files left behind in $SPLUNK_HOME/var/spool/splunk. This location is covered by a batch input in the system/default inputs.conf, which pulls in data in "sinkhole" mode. That should mean files there are loaded destructively (i.e., deleted after indexing). Furthermore, the documentation states:
"As long as this is set, Splunk won't keep track of indexed files. Without the move_policy = sinkhole setting, it won't load the files destructively and will keep track of them."
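For reference, this is roughly what the default spool stanza looks like in system/default/inputs.conf (a sketch from memory; the exact attributes on any given version are best confirmed with btool):

```
[batch://$SPLUNK_HOME\var\spool\splunk]
move_policy = sinkhole
crcSalt = <SOURCE>
```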
While trying to figure out why files remain in the spool directory, I ran a search against the tailing processor's REST endpoint:
| rest /services/admin/inputstatus/TailingProcessor:FileStatus splunk_server=<my_server> | transpose | search column="*\\var\\spool\\splunk\\*.type"
I'm seeing results indicating "finished reading" (good), but the named files are still present in the directory. In addition, I'm seeing cases where the status reads "ignored file (crc conflict, needs crcSalt)". This latter case would seem to contradict the documentation's claim that Splunk won't track files in a sinkhole location.
In any event, I'd like to understand why my spool directory is filling up, both with files Splunk has indexed and with files it refuses to index!
Version is Splunk 4.3.4 on Windows.
Any help appreciated.
You might be experiencing bug SPL-59578, which is fixed in 4.3.6 and expected to be resolved in 5.0.3. I would recommend upgrading to one of those versions and watching for the issue again. Let me know if I guessed correctly 🙂
Could you show us the $SPLUNK_HOME/var/spool/splunk stanza as it appears in the output of $SPLUNK_HOME/bin/splunk cmd btool inputs list --debug?
Something else you might want to do in order to troubleshoot this issue is to bump the TailingProcessor, WatchedFile, and BatchReader categories to DEBUG in log.cfg. The next time you spot a spool file that hasn't been processed, check splunkd.log and look for the affected file name there.
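A minimal sketch of what those entries might look like in $SPLUNK_HOME/etc/log.cfg (or in log-local.cfg, so the change survives upgrades); a splunkd restart is needed for it to take effect:

```
[splunkd]
category.TailingProcessor=DEBUG
category.WatchedFile=DEBUG
category.BatchReader=DEBUG
```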