I've been running a daily saved search for EventCode=1221 (Exchange space reporting).
I noticed that one of my mail servers sometimes goes missing from the search, but the event does exist in the Windows Application log when I filter by 1221. I searched the Splunk forwarder logs (i.e. 'index=_* host="mailserver"') and found many of these errors. The Splunk forwarder is 4.3.4 (the latest prior to the 5.0 release). Restarting the forwarder seems to fix the issue for one day at most, and then I'm back to the same issue.
Any idea what's going on here?
ERROR WinEventLogChannel - saveCheckpointStr: Failed to rename checkpoint file 'C:\Program Files\SplunkUniversalForwarder\var\lib\splunk\persistentstorage\WinEventLog\Security_checkpoint.tmp' -> 'C:\Program Files\SplunkUniversalForwarder\var\lib\splunk\persistentstorage\WinEventLog\Security_checkpoint': Access is denied.
In other cases, an antivirus scan has been observed holding the checkpoint file open at the same moment that Splunk attempts to rename it.
Please stop the antivirus and retry.
Splunk 5.0.9 and 6.0 include improvements targeting this particular scenario: the rename attempt is retried at a later time.
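To illustrate the general idea of retrying a rename that fails while another process briefly holds the file, here is a minimal Python sketch. It is not Splunk's code, and it is only an assumption that the forwarder does something broadly similar. The function name `replace_with_retry` and its `attempts`, `delay`, and `replace` parameters are hypothetical and chosen for illustration.

```python
import os
import time


def replace_with_retry(tmp_path, final_path, attempts=5, delay=0.5,
                       replace=os.replace):
    """Move tmp_path over final_path, retrying when access is denied.

    Hypothetical helper for illustration only. Returns the number of
    the attempt that succeeded. Re-raises PermissionError if every
    attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            # os.replace overwrites the destination if it already exists.
            replace(tmp_path, final_path)
            return attempt
        except PermissionError:
            # On Windows, "Access is denied" surfaces as PermissionError,
            # for example while a scanner has the file open.
            if attempt == attempts:
                raise
            time.sleep(delay)
```

The `replace` parameter exists only so the retry path can be exercised with a stand-in function; in normal use the default `os.replace` would be left in place. If the file is held for longer than `attempts * delay` seconds, the error is still raised, so a retry only helps with short-lived locks.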
I've noticed that after upgrading the agents to v5.0.5, this is less likely to happen. Some of my busy Exchange servers are consistently sending event logs.
Was anything further learned about this issue? Did you find a root cause?
ERROR WinEventLogChannel - saveCheckpointStr: Failed to rename checkpoint file 'C:\Program Files\SplunkUniversalForwarder\var\lib\splunk\persistentstorage\WinEventLog\Security_checkpoint.tmp' -> 'C:\Program Files\SplunkUniversalForwarder\var\lib\splunk\persistentstorage\WinEventLog\Security_checkpoint': Access is denied.
This appears to be a permissions issue.
The 5.0.1 forwarder also logs this:
INFO WinEventLogChannel - initWinEvtApi: We must be in an XP/2k3 family OS. Switching using the old Windows Event Log api: The specified module could not be found..
I upgraded to the 4.3.5 forwarder and then to the 5.0.1 forwarder, and the same issue exists on both.
I think the event log monitoring stops because this server has limited disk write performance. Thoughts?