Here's an example snippet of the logs I'm working with:
2018-04-17 18:26:02 app=test-app, env=qa, total_msg=0
2018-04-17 18:25:02 app=test-app, env=qa, total_msg=60
2018-04-17 18:24:02 app=test-app, env=qa, total_msg=0
2018-04-17 18:23:02 app=test-app, env=qa, total_msg=100
2018-04-17 18:22:02 app=test-app, env=qa, total_msg=50
I'd like to create alerts and a dashboard around these metrics. I've been attempting to use delta, but it returns a negative number for my 'msg_proc' value. The query I'm using is:
index=myindex sourcetype=mymetrics environment="qa" app=test-app
| bucket _time span=1m
| stats sum(total_msg) as current by _time, app
| delta current
| rename delta(current) as msg_proc
The above query results in the output below:
_time,app,current,msg_proc
2018-04-17T14:59:00,test-app,0,0
2018-04-17T15:00:00,test-app,0,0
2018-04-17T15:01:00,test-app,42,0
2018-04-17T15:02:00,test-app,27,-15
2018-04-17T15:03:00,test-app,35,8
2018-04-17T15:04:00,test-app,21,-14
2018-04-17T15:05:00,test-app,3,-18
2018-04-17T15:06:00,test-app,1,-2
2018-04-17T15:07:00,test-app,1,0
2018-04-17T15:08:00,test-app,1,0
2018-04-17T15:09:00,test-app,1,0
As expected, the delta shows a negative number (since that many messages were processed since the previous value). I'd like to send an alert when current is > 0 and messages are not being consumed within a 5-minute period (or when consumption is slow). I'm struggling to define good logic to alert on this condition. I'd also like to create a dashboard for specific queues that shows the msg_proc rate for each queue, if possible. Hope this makes sense!
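To make the condition more concrete, here is a rough sketch in Python (not SPL) of the logic I have in mind. The function name `stalled`, the default `window` of 5 samples, and the rule "backlog is non-zero for the whole window and never drops" are my own assumptions about what "not being consumed" should mean, not something Splunk provides; the sample list is the `current` column from the output above.

```python
def stalled(current, window=5):
    """Return True if the backlog stayed above zero for the whole
    window and never decreased (i.e. nothing was consumed)."""
    recent = current[-window:]
    if len(recent) < window:
        return False  # not enough history to decide yet

    # Condition 1: there is a backlog in every sample of the window.
    backlog_present = all(v > 0 for v in recent)

    # Condition 2: a negative delta means messages were consumed.
    deltas = [b - a for a, b in zip(recent, recent[1:])]
    consumed = any(d < 0 for d in deltas)

    return backlog_present and not consumed


# 'current' values from the query output above, oldest first.
sample = [0, 0, 42, 27, 35, 21, 3, 1, 1, 1, 1]

print(stalled(sample))     # last 5 values are 3, 1, 1, 1, 1: one drop, so False
print(stalled(sample, 4))  # last 4 values are 1, 1, 1, 1: no drop, so True
```

A "slow" variant could compare the total drop over the window against a threshold instead of checking for any drop at all, but I'm not sure what a sensible threshold would be.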