Splunk newbie here. I understand Splunk can purge records from its own repository using buckets and parameter settings in seconds, but can Splunk remove files from the source after they have been "read"/"copied"? If not, I'll need to create a job to remove source files manually, but how would I know which files have been safely copied over to the Splunk repository so that I can remove them?
OK, I'll forward the answers to the sys admin so he can test. This seems a bit cryptic to me, so we'll have to test whether it continuously removes files from the source after they have made it to the Splunk repository.
When using batch mode you may also want to enable TCP acknowledgements so that the data isn't purged from the source until the receiver (either an intermediate forwarder or an indexer) acknowledges receipt of the data.
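For illustration, acknowledgement is switched on at the forwarder in outputs.conf through the useACK setting. This is only a minimal sketch: the output group name my_indexers and the receiver address indexer.example.com:9997 are hypothetical placeholders, so substitute your own values and test before relying on it.

```
# outputs.conf on the forwarder (group name and server are hypothetical)
[tcpout:my_indexers]
server = indexer.example.com:9997
# Wait for the receiver to acknowledge data before treating it as delivered
useACK = true
```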
Yes, you can, by replacing the monitor:// stanza in your inputs.conf with a batch:// stanza:
[batch://<path>]
* One time, destructive input of files in <path>.
* For continuous, non-destructive inputs of files, use monitor instead.
# Additional attributes:
move_policy = sinkhole
* IMPORTANT: This attribute/value pair is required. You *must* include "move_policy = sinkhole" when defining batch inputs.
* This loads the file destructively.
* Do not use the batch input type for files you do not want to consume destructively.
host_regex = see MONITOR, above.
host_segment = see MONITOR, above.
crcSalt = see MONITOR, above.
# IMPORTANT: The following attribute is not used by batch:
# source = <string>
followSymlink = [true|false]
* Works similarly to monitor, but will not delete files after following a symlink out of the monitored directory.
# The following settings work identically as for [monitor://] stanzas, documented above
host_regex = <regular expression>
host_segment = <integer>
crcSalt = <string>
recursive = [true|false]
whitelist = <regular expression>
blacklist = <regular expression>
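
Putting the attributes above together, a batch input might look like the sketch below. The directory /var/log/myapp/archive and the whitelist pattern are hypothetical, chosen only to show the shape of the stanza. Because batch deletes each file once it has been read, point it only at a directory whose files you really want removed, and try it on copies first.

```
# inputs.conf (path and pattern are hypothetical examples)
[batch:///var/log/myapp/archive]
# Required for batch inputs: files are deleted after they are consumed
move_policy = sinkhole
# Only pick up files ending in .log
whitelist = \.log$
# Do not descend into subdirectories
recursive = false
```

With a stanza like this there is no need for a separate cleanup job to work out which files are safe to delete, since Splunk removes each file itself after reading it.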