We recently installed the Splunk on Splunk app on our search head and the TA-sos-win app on the Indexer. The SOS panel for Indexer CPU usage is at 100% almost always, while the panel for Search head CPU usage is at 2-5%.
When I log into the actual servers and open the Task Manager, the CPU utilization for both servers is between 10% and 30%.
Why is there such a big mismatch between the two values? The SOS panel is probably right in identifying the high load, since we have been observing slow indexing speeds on the Indexer. What could be the reason for high CPU usage on the Indexer? Any suggestions to bring it down?
I believe this discrepancy is caused by two major differences in how CPU usage is measured in S.o.S versus the Windows task manager:
Secondly, the ps_sos.ps1 scripted input that collects per-process resource usage measures CPU usage from WMI with the following method:
$pctCPU = get-wmiobject Win32_PerfFormattedData_PerfProc_Process -Filter "IDProcess = $myPID" | select -expand PercentProcessorTime
It is my understanding that this WMI counter measures "immediate" CPU usage, which tends to make these measurements spiky. The Windows task manager most certainly computes CPU usage using a decaying average method, which smooths things out.
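To illustrate why "immediate" samples look spikier than a decaying average, here is a minimal Python sketch. It does not reproduce how the Windows task manager or S.o.S actually compute their figures. The sample values, the `alpha` weight, and the function name `decaying_average` are all made up for the illustration.

```python
def decaying_average(samples, alpha=0.3):
    """Exponentially decaying average of a series of readings.

    alpha is a hypothetical smoothing weight: each new reading
    contributes alpha, the previous smoothed value contributes 1 - alpha.
    """
    smoothed = []
    current = 0.0  # assume the process was idle before the first reading
    for sample in samples:
        current = alpha * sample + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Hypothetical instantaneous CPU readings (percent) for a bursty process.
instant = [100, 0, 100, 0, 0, 100, 0, 0, 0, 100]
smooth = decaying_average(instant)

print("peak of instantaneous readings:", max(instant))
print("peak of smoothed readings:", round(max(smooth), 1))
```

With bursty input like this, the instantaneous series hits 100% whenever a reading lands on a burst, while the smoothed series never gets there. That is the kind of gap described above.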
Thanks for the excellent answer.
So if my S.o.S. panel shows 100% CPU usage, is that to be expected? Or is there anything that can be done to bring down those numbers?
It is somewhat expected for certain Splunk processes to use one or more full CPU cores while they run. Search processes, for example, are often CPU-bound and therefore run at one full CPU core / 100% for most of their run time.
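As a rough worked example of what "one full CPU core / 100%" means on a multi-core machine, the sketch below converts a per-core percentage into a share of the whole machine. It assumes the per-process figure is expressed relative to a single core, as described above, and that it is being compared against a figure averaged over all logical cores. The core count of 8 and the function name are hypothetical.

```python
def machine_wide_pct(per_core_pct, logical_cores):
    """Convert a percentage of one core into a percentage of the whole machine."""
    if logical_cores < 1:
        raise ValueError("logical_cores must be at least 1")
    return per_core_pct / logical_cores


# Hypothetical: one CPU-bound search process saturating one core of an 8-core server.
share = machine_wide_pct(100.0, 8)
print(f"{share}% of total machine capacity")
```

Under these assumptions, a process pinned at 100% of one core uses only a small fraction of a multi-core server, so a high per-process reading does not by itself mean the machine is saturated.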