OK, so I've got some weirdness going on with KPIs in ITSI.
I have a set of alerting data coming in that only gives me a record each time there is a state change. So I have to do some jigging around to get the last valid record and do a count of Critical / Warnings.
My search goes like this:
tag=Geneos_Severity_Alerts
| eval host=if($data.row.probe$="GW Data",replace($data.row.cell$,"_INF / probeStatus",""),$data.row.probe$)
| stats last(host) as host last(data.row.severity) as severity last(data.row.NAR-ID) as NARID last(operation) as operation by data.name
| search operation!=delete
| eval critical=if(severity="Critical",1,0)
| eval warning=if(severity="Warning",1,0)
I've checked this out ad-hoc, and it all picks up the right set of data and gives me a straightforward 1 or 0 to use in a sum to get counts on a KPI. The host represents the entities I have on the Services for filtering.
The weirdness is this: I've set up the KPI Base search using the above. When I apply it to a Service, I simply get an incorrect result. Same timespan, same filtering, same calculation. If I go into a deep dive, I can see the correct result by Entity. If I open the search there: correct result. If I flip the KPI setup against the Service to Ad-Hoc search (which then just uses the KPI base search long-hand, without me touching it): correct result.
Base Search: Nope. No dice. 2+1+1 apparently = 9. Just wrong.
Now I have had this search (and variations of it) working OK, but as we're in dev, I've had to delete out and recreate a bunch of Services / Entities.
Is this possibly a hangover of old data?
One possible problem with your base KPI search is that you have stripped _time from the results (by using 'stats'); ITSI needs _time to do its thing.
You could correct this in several ways, including adding 'latest(_time) as _time' to your stats function.
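As a rough sketch of that change applied to the search from the question (untested against your data, so treat it as a starting point rather than a definitive fix): the stats line now carries _time through with latest(_time). It also swaps last() for latest() on the other fields, on the assumption that what you want is the most recent record per data.name. last() returns the last value in the order stats sees the events, and since a normal search returns newest events first, that is typically the oldest value rather than the newest.

```
tag=Geneos_Severity_Alerts
| eval host=if($data.row.probe$="GW Data",replace($data.row.cell$,"_INF / probeStatus",""),$data.row.probe$)
| stats latest(_time) as _time latest(host) as host latest(data.row.severity) as severity latest(data.row.NAR-ID) as NARID latest(operation) as operation by data.name
| search operation!=delete
| eval critical=if(severity="Critical",1,0)
| eval warning=if(severity="Warning",1,0)
```

Everything other than the stats line is unchanged from the original search.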
Hi, I only got a chance to try this yesterday. It appears to have done the trick, though it hadn't helped that I had confused last and latest in what was needed in the stats function.