Is there a way for Splunk to treat receiving no value as a zero value, and then have the 'average' function take advantage of that?
When I use sparkline(avg(cpu)), we are seeing that if a process fired once in 60 minutes, with a 60-second polling interval, at 80%, its average is displayed in Splunk as 80%. That means Splunk is not taking into account the other 59 zero data points.
In the example above, the proper math would be 80/60 = 1.33333. Can this be done in Splunk via a setting / function / argument?
I spoke with my SE, and here is how it works.
Since there is only one data point in the image above, the process was only available for that one interval. Since the process didn't report at all, even by name, Splunk will not force a null value, or any value.
The way to do this is to set the divisor manually: (polling intervals per minute * minutes) gives you how many data points you should have, and then you perform the averaging yourself. Meaning something like this:
| stats sum(pctCPU) as cpu first(_time) as first last(_time) as last | eval itv=ceil((first - last) / 60) | eval cpu=(cpu / itv)
This will give you the manual average over a 60-minute period, if your polling interval is 60 seconds. It's painful, and gets more complicated if you start adding PIDs and users, but this may get you started. If I find more, I'll update this question.
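For the per-process case mentioned above, a hedged sketch (assuming the same top sourcetype and field names used elsewhere in this thread) is to compute the time window across all events with eventstats first, so that a process with only one data point is still divided by the full window rather than by its own tiny span:

index=myIndex sourcetype=top | eventstats min(_time) as first max(_time) as last | stats sum(pctCPU) as cpu first(first) as first first(last) as last by USER PID COMMAND | eval itv=max(ceil((last - first) / 60), 1) | eval cpu=round(cpu / itv, 2)

The max(..., 1) guards against dividing by zero when the search window is shorter than one polling interval.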
I think there is a difference between a NULL value and an EMPTY value. Maybe we have empty values instead of NULL, so try a different test on perc than isnull(); maybe isnum(...) will work.
It doesn't look like it. The value that the 'top' input of the Linux TA writes for the pctCPU field is "0.0". I tried both of these suggestions as you mentioned, and it ended up the same. Here is the search:
index=myIndex sourcetype=top USER=* PID=* USER!=root `linux_hostname` | lookup myLookup.csv nix_host as hostname | search hostname=* Environment="*" | eval pctCPU=if(isnull(pctCPU),0, pctCPU) | eval perc=(pctCPU/(cores * 100)*100) | stats sparkline(avg(perc)) as pctCPUTime avg(perc) as percCPUTime sparkline(avg(pctMEM)) as pctMEM avg(pctMEM) as percMEM by USER hostname Function COMMAND PID | where percCPUTime>0 | sort - percCPUTime | eval percCPUTime=round(percCPUTime,2) | eval percCPUTime=if(percCPUTime > 100,"100",percCPUTime) | eval percCPUTime=(percCPUTime + "%") | eval percMEM=round(percMEM,2) | eval percMEM=(percMEM + "%") | head 25
I still end up with results like this:
Maybe I'm simply not understanding how Splunk is averaging values, but it seems like it's not averaging across all data points.
Hi @tmarlette
Another option you can try before using the avg function is the fillnull command, which fills all null values for the field cpu with zeros. http://docs.splunk.com/Documentation/Splunk/6.2.2/SearchReference/Fillnull
Ex:
your base search... | fillnull value=0 cpu | stats avg(cpu)
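One caveat worth noting (my understanding of fillnull, not something stated in this thread): fillnull can only zero out a null field on events that already exist; it cannot invent data points for intervals in which the process logged nothing at all. A sketch that manufactures one bucket per minute first, and then fills the empty buckets, would look like:

your base search... | timechart span=1m avg(cpu) as cpu | fillnull value=0 cpu | stats avg(cpu) as avgCPU

Because timechart emits a row for every 1-minute span in the search window, the 59 silent minutes become null rows that fillnull can turn into zeros before the final average.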
You can do this with an eval that sets null values to 0. It should look something like this:
[search terms here] | eval somefield=if(isnotnull(extracted_field), extracted_field, 0) | stats avg(somefield)
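An equivalent shorthand (using the same placeholder field names) is coalesce, which returns its first non-null argument:

[search terms here] | eval somefield=coalesce(extracted_field, 0) | stats avg(somefield)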
I've tried this, however I'm still experiencing the same problem.
Here is my search now:
index=myIndex sourcetype=top USER!=root `linux_hostname`
| search hostname=* Environment="*" USER="*" | eval perc=(pctCPU/(cores * 100)*100) | eval perc=if(isnotnull(perc), perc, 0) | stats sparkline(max(perc)) as pctCPUTime avg(perc) as avg by COMMAND hostname Function | sort - percCPUTime | eval percCPUTime=if(percCPUTime > 100,"100",percCPUTime) | eval percCPUTime=round(percCPUTime,2) | eval percCPUTime=(percCPUTime + "%") | head 25
This is what I am attempting to prevent within the result set:
Notice that there is only one data point, whose value is 24, and the actual value of the average is 23.80, which means Splunk is using that value, and no others, to average. As this is an hour at a 60-second polling interval, the average should use 60 data points, meaning 24 / 60 (given that the other 59 points are 0, the math is that simple).
That equals 0.4. This does NOT equal 24, unless Splunk simply doesn't recognize zero as a valid value. Is there any way to do this appropriately?
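Since the expected number of data points is fixed (60 per hour at a 60-second interval), one workaround sketch, hard-coding that denominator the same way the math above does, is to sum instead of average and divide yourself:

your base search... | stats sum(perc) as total by COMMAND hostname | eval avgPerc=round(total / 60, 2)

The 60 here is an assumption tied to a one-hour window with 60-second polling; it would need to change with the time range or interval.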