We always see some failures in our logs, but when we have an issue, the number of failures goes through the roof. I'm trying to combine all the failure types, and the threshold we've specified for each, into a single search. I can create a bunch of individual searches/alerts, but I'd really like to combine them.
Here's a single/working one:
index=foo "FailureReason=24403" earliest=-30m | stats count by host | where count >20
Here's where I attempted to combine two:
index=foo earliest=-30m | stats count by host where count(FailureReason=24403) >20 OR count(FailureReason=22056) > 500
But obviously something is wrong with my search/syntax. Can anyone help please?
Thanks
Hi,
the following one should work:
index=foo earliest=-30m
| stats count(eval(FailureReason=24403)) AS fr_24403, count(eval(FailureReason=22056)) AS fr_22056 by host
| where fr_24403>20 OR fr_22056>500
As you can see, the stats command can do pretty neat tricks, but its syntax is not really straightforward. More info here: stats extended examples
Hope it helps,
regards
Sorry guys, neither of the above work. One user recommended that I try:
index=foo earliest=-30m
| rex field=_raw "FailureReason\=(?<FailureReason>[\d]+)"
| search FailureReason=*
| stats count by host, FailureReason
| where (FailureReason=24403 and count > 20) OR (FailureReason=22056 and count > 500)
This is the only one so far that works.
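If more failure codes get added later, one possible variation on the search above is to keep each code's threshold in a single case() expression instead of growing the where clause. This is only a sketch: the threshold field name is made up for illustration, and it assumes FailureReason is extracted from _raw by the rex, as in the working search.

```
index=foo earliest=-30m
| rex field=_raw "FailureReason=(?<FailureReason>\d+)"
| search FailureReason=*
| stats count by host, FailureReason
| eval threshold=case(FailureReason==24403, 20, FailureReason==22056, 500)
| where count > threshold
```

Codes not listed in case() get no threshold, so the final where should drop those rows. Adding a new code then means adding one more pair to the case() call.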
Try this
index=foo (FailureReason=24403 OR FailureReason=22056) earliest=-30m | chart count over host by FailureReason | where '24403'>20 OR '22056'>500