
Is *nix sourcetype=ps' pctCPU really suitable for charting OOTB?

Paolo_Prigione
Builder

Hi all, I have been thinking about the pctCPU metric in the *nix app's sourcetype=ps and how to plot it correctly.

Splunk's *nix app generally plots it with "... | timechart avg(pctCPU) by ...". That would be perfectly fine if pctCPU expressed instantaneous CPU usage (as top does). Instead, per the "man ps" definition (RHEL 5.5):

 %cpu       %CPU     cpu utilization of the process in "##.#" format. Currently, it is the CPU time used divided by the time the process has been running (cputime/realtime ratio), expressed as a percentage. It will not add up to 100% unless you are lucky. (alias pcpu).

So pctCPU expresses the average CPU usage of the process since it started (CPU time / total run time)!

Say a process has been running for a long time with low usage, then bursts for a few minutes, then drops back. ps' pctCPU would barely reflect this, because the CPU time added during the burst is small relative to the total run time over which the ratio is computed. The burst is smoothed away.
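To put rough numbers on that smoothing, here is a minimal Python sketch. The process and its figures (10 hours at 1% CPU followed by a 5-minute burst at 100%) are made up for illustration; only the cputime/realtime ratio comes from the man page quoted above.

```python
def lifetime_pct(cpu_seconds, elapsed_seconds):
    """ps-style %CPU: total CPU time over total run time, as a percentage."""
    return 100.0 * cpu_seconds / elapsed_seconds

# Hypothetical process: 10 hours at 1% CPU, then a 5-minute burst at 100%.
quiet_elapsed = 10 * 3600           # seconds of wall-clock time before the burst
quiet_cpu = 0.01 * quiet_elapsed    # CPU seconds consumed before the burst
burst_elapsed = 5 * 60              # seconds of wall-clock time during the burst
burst_cpu = 1.00 * burst_elapsed    # CPU seconds consumed during the burst

# What ps would report just before and just after the burst.
before = lifetime_pct(quiet_cpu, quiet_elapsed)
after = lifetime_pct(quiet_cpu + burst_cpu, quiet_elapsed + burst_elapsed)

# What the process actually used during the burst interval alone.
interval = 100.0 * burst_cpu / burst_elapsed

print(f"ps pctCPU before burst: {before:.2f}%")
print(f"ps pctCPU after burst:  {after:.2f}%")
print(f"usage during the burst: {interval:.2f}%")
```

Under these assumptions the value ps reports moves from 1% to a little under 2%, even though the process was fully busy for the whole burst.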

Does my reasoning make any sense to you?

I have a fairly complex solution in progress: compute the deltas of CPUTIME and ELAPSED (the field names from Splunk's ps.sh) between consecutive multikv'ed ps executions, derive an "instantaneous" pctCPU from them, then average and plot that.
However, it is slow (it requires | sort +host +PID +COMMAND +_time, with all its limits) and complex.
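The delta idea can be sketched outside Splunk in Python, just to make the arithmetic concrete. This is not the actual search, only an illustration: it assumes CPUTIME and ELAPSED arrive as ps-style durations ([[dd-]hh:]mm:ss), and the helper names (parse_ps_time, interval_pct_cpu) and the sample host, PID and values are hypothetical.

```python
from collections import defaultdict

def parse_ps_time(text):
    """Convert a ps duration such as '1-02:03:04', '02:03:04' or '03:04' to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:          # pad missing hours (and minutes) with zeros
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

def interval_pct_cpu(samples):
    """Return (host, PID, COMMAND, _time, pct) for each pair of consecutive samples."""
    by_process = defaultdict(list)
    for s in samples:
        by_process[(s["host"], s["PID"], s["COMMAND"])].append(s)

    results = []
    for key, rows in by_process.items():
        rows.sort(key=lambda r: r["_time"])      # same ordering as the sort in the search
        for prev, cur in zip(rows, rows[1:]):
            d_cpu = parse_ps_time(cur["CPUTIME"]) - parse_ps_time(prev["CPUTIME"])
            d_elapsed = parse_ps_time(cur["ELAPSED"]) - parse_ps_time(prev["ELAPSED"])
            if d_elapsed <= 0 or d_cpu < 0:
                continue                         # likely a restart or a reused PID; skip
            results.append((*key, cur["_time"], 100.0 * d_cpu / d_elapsed))
    return results

# Hypothetical samples for one process, taken 5 minutes apart.
samples = [
    {"host": "web01", "PID": 4242, "COMMAND": "java", "_time": 0,   "CPUTIME": "00:06:00", "ELAPSED": "10:00:00"},
    {"host": "web01", "PID": 4242, "COMMAND": "java", "_time": 300, "CPUTIME": "00:11:00", "ELAPSED": "10:05:00"},
    {"host": "web01", "PID": 4242, "COMMAND": "java", "_time": 600, "CPUTIME": "00:11:03", "ELAPSED": "10:10:00"},
]

results = interval_pct_cpu(samples)
for host, pid, command, t, pct in results:
    print(host, pid, command, t, f"{pct:.1f}%")
```

The first value of each process is lost because there is no earlier sample to subtract from, and a multi-threaded process can exceed 100% since CPUTIME accumulates across cores.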

Has anybody come up with something better?


yuanliu
SplunkTrust

A simple answer would be to use sourcetype=top instead. I have the same problem, but my use case requires stats by fields that are only available in sourcetype=ps, so that answer does not suffice for me. The unfortunate use of the same field name, pctCPU, in these two sourcetypes to mean very different things prompted my new question: https://answers.splunk.com/answers/318807/how-to-cherry-pick-values-from-different-sources.html


yuanliu
SplunkTrust

A more sophisticated solution to this problem is posted in the question linked above.
