We use Splunk to forward log files written by PM2 (a Node.js process manager) to our Splunk indexers. PM2 has its own log rotation feature and creates backup log files when its configured rotation limits are reached. These rotated files sit in the same folder as the live log, and we are indexing *.log. We DO want the rotated files to be evaluated, because there may be times when the forwarders are down and we don't want to miss anything that was logged in the meantime.
Example:
prog.log
prog_2018-01-03.log
prog_2018-01-02.log
prog_2018-01-01.log
In the above scenario, how do we keep events that have already been indexed from prog.log from being indexed again when the file is rotated out as prog_date.log? Keep in mind that we don't want to miss any entries during outages, so we want to continue processing the dated logs as a backup.
We just upgraded to splunkforwarder 7.0.4, since we were under the impression it would assist with this, but we are still seeing the same results.
Generally, Splunk should not re-ingest the content of a renamed file if it has already read it, but it does when you use crcSalt = <SOURCE> in the inputs.conf stanza. Could you share the full stanza from your forwarder's inputs.conf that monitors this file?
Yeah, I've got that, but I'd added it for another reason. Here's the stanza in question:
[monitor:///var/log/mservices/]
sourcetype = microservices_log
index = mservices
disabled = false
blacklist = \.(bz2|gz)$
crcSalt = <SOURCE>
Is there something else I could use for crcSalt that would alleviate this issue?
What was the reason you added crcSalt?
It was done many months ago by a member of my team who is no longer here. I think it was to prevent double indexing within the main log file itself, if I'm not mistaken.
The official description of crcSalt = <SOURCE> is this:
The crcSalt attribute, when set to <SOURCE>, ensures that each file has a unique CRC. The effect of this setting is that Splunk Enterprise assumes that each path name contains unique content.
So when your monitoring stanza, because of the wildcard, includes both the regular logs and the rolled logs, you shouldn't be using crcSalt = <SOURCE>. Do your files contain some sort of header in their first few lines?
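To illustrate why the salt causes the duplicates, here is a rough Python sketch. It is only a simplified model, not Splunk's actual implementation: it assumes the forwarder identifies a file by a checksum over its first 256 bytes (the documented default for initCrcLength), and that crcSalt = <SOURCE> mixes the full path into that checksum. The function name file_signature and the sample log line are made up for the example.

```python
import zlib

# Assumption: number of leading bytes used for the checksum
# (256 is the documented default of initCrcLength).
INIT_CRC_LENGTH = 256


def file_signature(content: bytes, source: str, salt_with_source: bool) -> int:
    """Simplified model of a monitored file's signature (hypothetical helper)."""
    head = content[:INIT_CRC_LENGTH]
    if salt_with_source:
        # Models crcSalt = <SOURCE>: the full path becomes part of the checksum.
        head = source.encode("utf-8") + head
    return zlib.crc32(head)


# Hypothetical log content; the rolled file holds the same bytes as the
# live file it was renamed from.
log_lines = b"2018-01-03 12:00:00 INFO service started\n" * 20

live = "/var/log/mservices/prog.log"
rolled = "/var/log/mservices/prog_2018-01-03.log"

# Without a salt the renamed file keeps its signature, so it can be
# recognised as already read.
same_without_salt = (
    file_signature(log_lines, live, False)
    == file_signature(log_lines, rolled, False)
)

# With the path as salt the rename changes the signature, so the rolled
# file looks like new data and gets indexed again.
same_with_salt = (
    file_signature(log_lines, live, True)
    == file_signature(log_lines, rolled, True)
)

print(same_without_salt, same_with_salt)
```

In this model the unsalted signatures match after the rename, while the salted ones are expected to differ. That matches the behaviour you are seeing: every rolled file is treated as unread.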
No, no headers. Just log lines. Thinking about this more, there may have been some lines being split, or something similar, that caused us to add crcSalt. I read the definition and have already removed the crcSalt line, so now I just need to wait a day or two and see if some other weird issue raises its head.
I appreciate the assistance, and hope that this solves everything.