Hi,
I have an event that is received once every 2 minutes. I am trying to set up an alert that fires if the Value for the event stays above a certain threshold for 15 minutes or more. I am using the query below.
index=x host=y
| where Value > Threshold
| sort _time
| bin _time span=16m
| stats count by host _time
| where count > 6
| eval count = count * 2
Does the above query need any changes to work?
Thanks in advance.
Some minor edits:
index=x host=y Value > Threshold # moved Value > Threshold up into the base search (this works when Threshold is a literal number; comparing two fields still needs where); you probably also want to filter to a very specific set of logs
| sort _time # why do you need the sort? Events are already returned in descending _time order by default
| bin _time span=16m
| stats count by host, _time # added a comma for readability
| where count > 7 # shouldn't this be 7? You'd want all eight 2-minute events in the bin to be above the threshold
| eval count = count * 2 # why do you need this line?
There are other ways to do this, such as grabbing the earliest and latest times at which the value exceeded the threshold and taking the difference. I would also urge you to get comfortable testing your alerts. In this case, lower the threshold and check whether, for example, a threshold of 0 returns the complete result set of all the hosts you would expect to see.
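A rough sketch of the earliest/latest approach, not a tested alert. It assumes Threshold stands in for a literal number, that the alert searches only the last 16 minutes (the earliest=-16m modifier is my assumption), and that the field names first_over, last_over, events_over and minutes_over are made up for illustration:

```spl
index=x host=y Value > Threshold earliest=-16m
| stats min(_time) AS first_over, max(_time) AS last_over, count AS events_over by host
| eval minutes_over = round((last_over - first_over) / 60, 1)
| where events_over >= 8
| eval first_over = strftime(first_over, "%Y-%m-%d %H:%M:%S"), last_over = strftime(last_over, "%Y-%m-%d %H:%M:%S")
```

Note that the difference between the first and last exceeding event only measures the span between them. It does not prove the value stayed above the threshold the whole time, which is why the count check is kept. With one event every 2 minutes, eight consecutive events are 14 minutes apart from first to last, so adjust the window and the count to match what "15 minutes" should mean for you.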
Hope this helps!
| sort _time # this is to make it easier for the application team to read the logs when they open the alert, so that all the events are in ascending order
| bin _time span=16m
| stats count by host _time
| where count > 7 # yes, this should be 7
| eval count = count * 2 # only to display the number of minutes the value was above the threshold
| rename count AS "Minutes Over Threshold", host AS Host
Thank you for the help.
How would I go about grabbing the earliest and latest times at which the Value exceeded the threshold and taking the difference?
Thank you.