The *nix app has a CPU-by-process search that gives misleading results under certain conditions:
index="os" sourcetype="ps" host="$host$" | multikv fields pctCPU, COMMAND | timechart avg(pctCPU) by COMMAND
The problem is that if multiple processes with the same command name appear in a single event, the search averages them. So five foo processes, each consuming 3% CPU, show up as foo=3% when the real total is 15%.
I worked around this by combining COMMAND with PID to make each series unique:
index="os" sourcetype="ps" host="$host$" | multikv fields pctCPU, COMMAND, PID | strcat COMMAND "_" PID cmd | where pctCPU>0 | timechart avg(pctCPU) by cmd limit=0
But this is messy on systems with 50+ processes sharing the same COMMAND name, and Firefox doesn't seem to like limit=0.
Ideally I could sum pctCPU within each event for all processes with the same COMMAND name. This would give a single line on the chart for foo showing 15%, instead of five lines showing foo_$pid at 3% each. Is this possible?
You're right. This might do it:
index="os" sourcetype="ps" host="$host$" | multikv fields pctCPU, COMMAND | stats sum(pctCPU) as pctCPU by _time,COMMAND | timechart avg(pctCPU) by COMMAND
That is, sum the CPU for each command at each measurement (the rows that share the same _time) before you bucket and average.
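To see why the order matters, here is a rough sketch of the same arithmetic outside Splunk, in plain Python. The sample data, the `events` structure, and both function names are made up for illustration; this only mimics what averaging rows directly versus summing per _time first would do. It is not how Splunk implements either command.

```python
from collections import defaultdict

# Hypothetical sample: two ps snapshots (events), keyed by timestamp,
# each listing (COMMAND, pctCPU) rows as multikv would extract them.
events = {
    "12:00": [("foo", 3.0)] * 5 + [("bar", 10.0)],
    "12:01": [("foo", 3.0)] * 5 + [("bar", 20.0)],
}

def naive_average(events):
    """Average every row per command, ignoring which snapshot it came from."""
    rows = defaultdict(list)
    for snapshot in events.values():
        for command, pct in snapshot:
            rows[command].append(pct)
    return {command: sum(v) / len(v) for command, v in rows.items()}

def sum_then_average(events):
    """Sum per (snapshot, command) first, then average the sums across snapshots."""
    totals = defaultdict(list)
    for snapshot in events.values():
        per_command = defaultdict(float)
        for command, pct in snapshot:
            per_command[command] += pct
        for command, total in per_command.items():
            totals[command].append(total)
    return {command: sum(v) / len(v) for command, v in totals.items()}

naive = naive_average(events)
fixed = sum_then_average(events)
print(naive["foo"])  # 3.0, the misleading per-process average
print(fixed["foo"])  # 15.0, the total CPU used by all foo processes
```

Commands with a single process per snapshot (bar here) come out the same either way, so the extra stats step only changes the series that were being under-reported.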
That did it, so simple. Thank you kindly, much appreciated.