Hi All,
I need some help setting up an alert based on the standard deviation of the error occurrence counts.
I have this crude query as of now:
index= earliest=-30d NOT(date_wday=saturday) NOT(date_wday=sunday) | timechart span=7d count by tms_logcat limit=40
I would like an alert to be thrown for the errors highlighted (in quotes) below:
_time ERR-05257 ERR-05258 ERR-05259 ERR-05266 ERR-05275 ERR-05276 ERR-05279 ERR-05354 ERR-07507 ERR-07801
8/25/2014 0 1 9 386 12438 361 341314 1 0 2 11
9/1/2014 645 0 "5950" 1251 13959 9 434539 2 0 0 6
9/8/2014 722 0 0 "20370" 15014 4 356794 1 10 0 0
9/15/2014 0 0 0 627 12237 241 406969 2 0 1 8
9/22/2014 755 1 0 288 5204 64 "3994" 0 0 0 0
(Apologies - I do not seem to have enough karma points to attach the screenshot of the error data given above.)
Of course this is going to sound like a shameless plug, but honestly, the easiest way to do this is with the Prelert Anomaly Detective app.
Using the QuickMode feature, you can literally put this search in:
index=whatever NOT(date_wday=saturday) NOT(date_wday=sunday) | timechart count by tms_logcat limit=0
and Anomaly Detective will automatically take care of baselining the normal occurrence rate of each error type ("tms_logcat"). It also lets you alert on this data on an ongoing basis, with a one-click option to schedule the search to run in the background (every 5 minutes, for example). How-it-works video: http://support.prelert.com/customer/portal/articles/1417340-quickmode
By the way, don't get caught up in trying to use standard deviation as your approach to express anomalousness. Thresholding on a number of standard deviations implicitly assumes that the data samples (in this case, "counts of errors") conform to a nice, symmetrical Gaussian bell curve. In most cases, counts of things are better modeled by Poisson curves. Anomaly Detective automatically figures out the best statistical model for your data to maximize accuracy and minimize false alerting.
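To make the Gaussian-versus-Poisson point concrete, here is a rough Python sketch (standard library only). It is not how Anomaly Detective works internally. The helper names, the rate of 2 errors per week, and the threshold of 7 are all assumptions made up for illustration. The weekly counts are copied from the fourth value column of the table in the question.

```python
import math
from statistics import NormalDist, mean, stdev


def poisson_upper_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    # Each term is evaluated in log space to avoid overflow in lam**i and i!.
    cdf = sum(math.exp(i * math.log(lam) - lam - math.lgamma(i + 1)) for i in range(k))
    return max(0.0, 1.0 - cdf)


def gaussian_upper_tail(k: float, mu: float, sigma: float) -> float:
    """P(X >= k) for a normal distribution with the given mean and std dev."""
    return 1.0 - NormalDist(mu, sigma).cdf(k)


# Hypothetical low-volume error: 2 occurrences per week on average.
lam = 2.0
threshold = 7
p_poisson = poisson_upper_tail(threshold, lam)
# A Gaussian with the same mean and variance as the Poisson above.
p_gauss = gaussian_upper_tail(threshold, mu=lam, sigma=math.sqrt(lam))
print(f"Poisson  P(count >= {threshold}) = {p_poisson:.5f}")
print(f"Gaussian P(count >= {threshold}) = {p_gauss:.5f}")

# Weekly counts from the fourth value column of the table in the question.
weekly = [386, 1251, 20370, 627, 288]
z = (max(weekly) - mean(weekly)) / stdev(weekly)
print(f"z-score of the largest week = {z:.2f}")
```

With the same mean and variance, the Gaussian rates a week of 7 or more errors as more than ten times less likely than the Poisson model does, so a "k standard deviations" alert would fire on such weeks far more often than the threshold suggests. The last lines show a second pitfall: the quoted spike of 20370 inflates the standard deviation it is measured against, so its z-score stays below 2. With only five weekly buckets, no value can be more than about 1.79 sample standard deviations from the mean. Real error counts are also often burstier than a plain Poisson model, so treat this as an illustration rather than a ready-made detector.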