
Why does the S.o.S. - Splunk on Splunk app show significantly different CPU usage from the Windows Task Manager?

nivedita_viswan
Path Finder

We recently installed the Splunk on Splunk app on our search head and the TA-sos-win app on the indexer. The S.o.S panel for indexer CPU usage is at 100% almost always, while the panel for search head CPU usage is at 2-5%.
When I log into the actual servers and open the Task Manager, the CPU utilization for both servers is between 10% and 30%.

Why is there such a big mismatch between the two values? The S.o.S panel is probably right in identifying the high load, since we have been observing slow indexing speeds on the indexer. What could be the reason for high CPU usage on the indexer? Any suggestions to bring it down?

1 Solution

hexx
Splunk Employee

I believe this discrepancy is caused by two major differences in how CPU usage is measured in S.o.S versus the Windows Task Manager:

  • My guess is that in the Windows Task Manager you are looking at system-wide CPU usage, where "100%" represents ALL CPU resources on the system. In the Resource Usage view of S.o.S, CPU usage is expressed as "1 CPU core used = 100%". Therefore, on an 8-core machine, if the main splunkd process uses 2 cores, S.o.S will show it as using 200% CPU (just as Linux's "top" command would), while the Windows Task Manager might show a system-wide CPU usage of 25% (2 cores used out of a total of 8).
  • Secondly, the ps_sos.ps1 scripted input that collects per-process resource usage measures CPU usage from WMI with the following method:

    $pctCPU = get-wmiobject Win32_PerfFormattedData_PerfProc_Process -Filter "IDProcess = $myPID" | select -expand PercentProcessorTime

It is my understanding that this WMI counter measures "immediate" CPU usage, which tends to make these measurements spiky. The Windows Task Manager most likely computes CPU usage using a decaying average method, which smooths things out.
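To make the two effects concrete, here is a minimal sketch of the arithmetic in Python. The function names and the smoothing factor `alpha` are made up for illustration; this is not the S.o.S code, and it is only a guess at the kind of smoothing the Task Manager might apply, not its actual algorithm.

```python
def per_core_to_system_pct(per_core_pct, n_cores):
    """Convert a "1 core = 100%" reading into a system-wide percentage."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return per_core_pct / n_cores


def decaying_average(samples, alpha=0.3):
    """Exponentially weighted moving average over a series of readings.

    alpha is a hypothetical smoothing factor: closer to 1 follows the raw
    samples more closely, closer to 0 smooths more heavily.
    """
    smoothed = []
    current = None
    for sample in samples:
        if current is None:
            current = sample  # seed the average with the first reading
        else:
            current = alpha * sample + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# The example from the first bullet: 2 cores busy on an 8-core machine.
print(per_core_to_system_pct(200.0, 8))  # 25.0

# Spiky "immediate" readings versus their smoothed counterpart.
spiky = [100.0, 0.0, 100.0, 0.0]
print(decaying_average(spiky, alpha=0.5))  # [100.0, 50.0, 75.0, 37.5]
```

By the same arithmetic, a single process pinned at 100% in S.o.S on an 8-core machine would account for only 12.5% system-wide, which is in line with the 10-30% seen in the Task Manager.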


nivedita_viswan
Path Finder

Thanks for the excellent answer.
So if my S.o.S. panel shows 100% CPU usage, is that to be expected? Or is there anything that can be done to bring down those numbers?


hexx
Splunk Employee

It is somewhat expected for certain Splunk processes to use one or more full CPU cores while they run. Search processes, for example, are often CPU-bound and therefore run at one full CPU core / 100% for most of their run time.
