Trying out TA-nmon (1.3.13). I have it deployed on 3 Linux machines: two UFs (Universal Forwarders) and one SH (Search Head).
After one day of operation, one of the UFs stopped reporting some data. We're seeing messages like this from sourcetype nmon_processing (the exact sections change from minute to minute):
CONFIG section: will not be extracted (time delta of 240 seconds is inferior to 1 hour)
CPUnn section: Wrote 2 lines
ERROR: hostname: kzoldapd3 :CPU_ALL section is not consistent: Detected anomalies in events timestamp, dropping this section to prevent data inconsistency
WARN: hostname: kzoldapd3 :PROC section data is not consistent: the data header could not be identified, dropping the section to prevent data inconsistency
VM section: Wrote 1 lines
UPTIME section: Wrote 1 lines
WARN: hostname: kzoldapd3 :PROCCOUNT section data is not consistent: the data header could not be identified, dropping the section to prevent data inconsistency
WARN: hostname: kzoldapd3 :TOP section data is not consistent: the data header could not be identified, dropping the section to prevent data inconsistency
Any idea of the cause?
What is the workaround?
Hello!
Right, this happens because, for some unexpected reason, the header definitions for these sections are not found.
That's weird and should not happen.
If you kill the running nmon process on the forwarder (pkill nmon), this should resolve the current issue.
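For reference, a minimal sketch of that workaround (my wording, assuming standard Linux procps tools; the -x flag matches the exact process name so related scripts like nmon2csv are not hit):

```shell
# List any running nmon collector processes first, so you can confirm
# what is about to be killed; TA-nmon respawns a fresh collector on its
# next scheduled cycle after the old one is gone.
pgrep -ax nmon || echo "no nmon process running"
# pkill -x nmon    # uncomment to actually kill the collector(s)
```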
Can you please tell me:
- The operating system and version
- Whether the nmon processing is using Python or Perl (the nmon processing event will show either the Perl or Python version)
- Before killing the nmon process, the output of:
ls -ltr /opt/splunkforwarder/var/log/nmon/var/nmon_repository/fifo*/nmon_header.dat
These files should contain the current headers.
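If it helps, a hedged sketch for dumping the contents of those header files (the fifo* glob follows the path given above; if the repository layout differs on your host, the glob may simply match nothing):

```shell
# Print the cached section headers for each fifo directory, if present.
for f in /opt/splunkforwarder/var/log/nmon/var/nmon_repository/fifo*/nmon_header.dat; do
    [ -f "$f" ] || { echo "no nmon_header.dat found"; break; }
    echo "== $f =="
    cat "$f"
done
```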
Thank you
I had already restarted Splunk and stopped the nmon processes before I received this reply. I will follow these instructions and post the results if the problem recurs.
I did notice two sets of nmon processes running when I was cleaning things up....
Right, thank you.
This might be related to the parallel run that occurs a few minutes before the end of the current process.
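As a rough check (my own suggestion, not an official TA-nmon tool), you can count running nmon instances: seeing two briefly around the rotation window is the expected parallel run, but two that persist suggests a stale process left behind:

```shell
# Count running nmon processes; a count of 2 is normal only during the
# short parallel-run overlap before the old collector exits.
count=$(pgrep -x nmon | wc -l)
echo "nmon processes: $count"
[ "$count" -le 1 ] || echo "more than one nmon instance still running"
```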
Could you tell me the Python version on this server, please?
python -V
Python 2.6.9
Thank you.
Did the problem appear again?