Solved: Issues Monitoring Fast Rotating Logs - UNIX

glancaster · ‎05-13-2014

Hi All,

I am running into a few errors on my host that is monitoring some logs in RHEL. One of the logs in question could write, fill up, close and rewrite again, all within a second.

A few errors in my splunkd on the host:

05-12-2014 13:25:29.087 -0700 ERROR WatchedFile - Error reading file 'LOG LOCATION': Stale NFS file handle

05-12-2014 13:25:29.087 -0700 ERROR TailingProcessor - error from read call from 'LOG LOCATION'.

05-12-2014 13:26:24.187 -0700 INFO WatchedFile - File too small to check seekcrc, probably truncated. Will re-read entire file='LOG LOCATION'
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I am running crcSalt = and am still experiencing the problem. I've looked throughout Answers, but I'm not sure exactly what is causing this: the speed at which the file is written, Splunk thinking it has already read the file, or something else.

Anyone have any ideas?

Thanks in advance!

dwaddle · ‎05-13-2014

It is unlikely the crcSalt option is going to help you in this case. This sounds like a fairly classic race condition. One of the things Splunk does is stat(2) a file to see if the modtime / size has changed. If your files are completely replaced in a very short period of time, a file can be changed out from under Splunk between the stat() call and the open() call.
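To make the stat()/open() gap concrete, here is a minimal Python sketch of that kind of race. It is not Splunk's code: `stat_then_open`, the `between` hook, and the file names are hypothetical, and it runs on a local temp directory, so it shows the file being swapped rather than the NFS "Stale file handle" error itself.

```python
import os
import tempfile


def stat_then_open(path, between=None):
    """Mimic a tailer: stat() the path, then open() and read it.

    `between` is a hypothetical hook that runs in the gap between the two
    calls, standing in for a writer that rotates the log at that moment.
    """
    st = os.stat(path)
    if between is not None:
        between()
    with open(path, "rb") as fh:
        inode_at_open = os.fstat(fh.fileno()).st_ino
        data = fh.read()
    return {
        "size_at_stat": st.st_size,
        "bytes_read": len(data),
        "same_file": st.st_ino == inode_at_open,
    }


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")
        with open(log, "wb") as fh:
            fh.write(b"first generation\n" * 100)

        def rotate():
            # Write a fresh file and move it over the old name.
            new = os.path.join(tmp, "app.log.new")
            with open(new, "wb") as fh:
                fh.write(b"x\n")
            os.replace(new, log)

        quiet = stat_then_open(log)                  # nothing happens in the gap
        raced = stat_then_open(log, between=rotate)  # rotation lands in the gap
        return quiet, raced


if __name__ == "__main__":
    for label, result in zip(("quiet", "raced"), demo()):
        print(label, result)
```

In the raced case the size seen by stat() no longer describes the file that open() returns, which is the situation a reader ends up in when the log rotates faster than it can be checked.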

You can try the time_before_close and alwaysOpenFile options in inputs.conf. They may help, but most likely will not: race conditions are hard.
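As a rough sketch, the stanza might look like the following. The monitor path and both values are placeholders for illustration, not tested recommendations, and neither setting removes the underlying race.

```ini
# inputs.conf on the forwarder (path below is a placeholder)
[monitor:///path/to/fast-rotating.log]
# Seconds to wait for new data after reaching EOF before closing the file.
time_before_close = 10
# Open the file to check it even when the modtime/size checks show no change.
# This can add load, so treat it as a last resort.
alwaysOpenFile = 1
```

Restart the forwarder after changing inputs.conf so the new settings are picked up.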

amrit · ‎05-13-2014

Agreed. There's nothing you can do here other than increase the amount of time the file sticks around.
