Alerting

Calculate average value over time and alert on a 200% increase

sonicZ
Contributor

Hi,

I am trying to track a value on a backend server and alert if a certain operation spikes to greater than 200% of its average count per 5 minutes. I'm not sure how to do the alert part unless I enter a static value like this and alert on the eval "high" value.

index="vip" host=ship*be* OR host=van*be* OPERATION="Validate" source="/app/logs/vipservices/vipservices.log" earliest=-5m | timechart span=5m count by host | eval BE_spike = if( count > 2000, "high", "normal")

What's the best way to schedule an alert if the OPERATION=Validate average spikes higher than 200% of the previous values over time?

1 Solution

lguinn2
Legend

Try this:

index="vip" host=ship*be* OR host=van*be* OPERATION="Validate" 
source="/app/logs/vipservices/vipservices.log" earliest=-5m 
| stats count as Last5Minutes by host
| join host [ search index="vip" host=ship*be* OR host=van*be* OPERATION="Validate" 
    source="/app/logs/vipservices/vipservices.log" earliest=-30d latest=-5m
    | bucket span=5m _time
    | stats count by host _time
    | stats avg(count) as Average by host ]
| where Last5Minutes > Average
| table host Last5Minutes Average

And set the alert to trigger when the number of results is greater than zero.

Test it by removing the where command. Also, I updated this after I realized that the original (using timechart) wasn't working properly.

dwaddle
SplunkTrust

There is a video of a presentation by Jesse Trucks at a recent Splunk Live that covers almost exactly this topic. Watch it at https://vimeo.com/66779015

sonicZ
Contributor

I have another request on this answer: what if I want to run the same query but compare the last 5 minutes against the last 12 or 24 hours? I am experimenting with spans and dividing the avg(count)... Math is hard 🙂
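
A possible sketch for the longer baseline, assuming the same index, source, and field names as the accepted answer. Only the subsearch time range changes. Grouping by `host _time` keeps one count per 5-minute bucket, so `avg(count)` is a per-bucket average rather than the total for the whole range. The `2 * Average` threshold is one reading of "200% of the average"; a 200% increase would be `3 * Average`.

```spl
index="vip" (host=ship*be* OR host=van*be*) OPERATION="Validate"
source="/app/logs/vipservices/vipservices.log" earliest=-5m
| stats count as Last5Minutes by host
| join host [ search index="vip" (host=ship*be* OR host=van*be*) OPERATION="Validate"
    source="/app/logs/vipservices/vipservices.log" earliest=-24h latest=-5m
    | bucket span=5m _time
    | stats count by host _time
    | stats avg(count) as Average by host ]
| where Last5Minutes > 2 * Average
| table host Last5Minutes Average
```

Use `earliest=-12h` for the 12-hour comparison. Buckets with no events produce no row, so they are not counted as zeros in the average. A subsearch over this much data may also run into subsearch time or result limits, in which case a summary index may be the more practical baseline.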

sonicZ
Contributor

Awesome, this works. Thanks again!

lguinn2
Legend

All I did was use eval to create a new field called orig_host in the first search. You could also use rename.
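
The rename variant might look like the sketch below. It assumes the same summary index, saved search name, and field names as the posted search, and it swaps `eval orig_host = host` for `rename`, so the outer results carry `orig_host` instead of `host`. It also groups the subsearch counts by `orig_host _time`, so that `avg(count)` averages over 5-minute buckets.

```spl
index="vip" (host=ship*be* OR host=van*be*) OPERATION="Validate"
source="/app/logs/vipservices/vipservices.log" earliest=-5m
| stats count as Last5Minutes by host
| rename host as orig_host
| join orig_host
    [ search index=summary_vip (orig_host=ship*be* OR orig_host=van*be*) OP="Validate"
    source="VIP Operations by Host Summary Index Search 5 Min" earliest=-15m latest=-5m
    | bucket span=5m _time
    | stats count by orig_host _time
    | stats avg(count) as Average by orig_host ]
| eval doubleAVG=(2*Average)
| where Last5Minutes > doubleAVG
| table orig_host Last5Minutes Average doubleAVG
```

The difference from the eval version is only that `rename` replaces the `host` field, while `eval` keeps both `host` and `orig_host` in the outer results.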

lguinn2
Legend

index="vip" host=ship*be* OR host=van*be* OPERATION="Validate"
source="/app/logs/vipservices/vipservices.log" earliest=-5m
| stats count as Last5Minutes by host
| eval orig_host = host
| join orig_host
[ search index=summary_vip orig_host=ship*be* OR orig_host=van*be* OP="Validate"
source="VIP Operations by Host Summary Index Search 5 Min" earliest=-15m latest=-5m
| bucket span=5m _time
| stats count by orig_host _time
| stats avg(count) as Average by orig_host ]
| eval doubleAVG=(2*Average)
| where Last5Minutes > doubleAVG
| table orig_host Last5Minutes Average doubleAVG

sonicZ
Contributor
index="vip" host=ship*be* OR host=van*be* OPERATION="Validate" 
source="/app/logs/vipservices/vipservices.log" earliest=-5m 
| stats count as Last5Minutes by host
| join host, orig_host 
[ search index=summary_vip  orig_host=ship*be* OR orig_host=van*be* OP="Validate" 
source="VIP Operations by Host Summary Index Search 5 Min" earliest=-15m latest=-5m 
| bucket span=5m _time  
| stats count by orig_host  
| stats avg(count) as Average by orig_host ] 
| eval doubleAVG=(2*Average) 
| where Last5Minutes > doubleAVG
| table orig_host Last5Minutes Average doubleAVG

sonicZ
Contributor

Hi Lisa,
Based on your answer I think I am getting close. I already have a saved search gathering some Validate operations in a summary index.
The problem is that I can't join orig_host to host, because the summary index stores hosts as orig_host while the regular vip index uses host. Do you know any workarounds?

sonicZ
Contributor

Lisa, I tried the subsearch you posted above with a shorter time range:
"earliest=-1h" returns averages of around 10k per host across the 8 hosts
"earliest=-4h" returns averages of around 40-70k per host across the 8 hosts
"earliest=-6h" returns averages of around 70-90k per host across the 8 hosts

The earliest=-30d search would take way too long to finish.
If I just want to run it every 5 minutes, should I use earliest=-10m latest=-5m? And how would I make the "where Last5Minutes > Average" only alert if 200% of the average is reached?

sonicZ
Contributor

Lisa, this definitely returns results, 8k or so within -5m

lguinn2
Legend

What does this return?

index="vip" host=ship*be* OR host=van*be* OPERATION="Validate"
source="/app/logs/vipservices/vipservices.log" earliest=-5m

Also note that I have updated my answer above!

sonicZ
Contributor

Lisa, thanks. I could not get the above search to return results. I tried lowering the range to earliest=-1h but still get no results, even from the subsearch on its own.

I'll try again Monday.

sonicZ
Contributor
index="vip" host=ship*be* OR host=van*be* operation=Validate source="/app/logs/vipservices/vipservices.log" | timechart count span=1m | streamstats window=20 avg(count) as avgCount | fields _time avgCount

Or

index="vip" host=ship*be* OR host=van*be* operation=Validate source="/app/logs/vipservices/vipservices.log" | timechart span=1m avg(count) as avgcount |  bucket _time span=1m
| stats count by _time
| stats avg(count) as AverageCount | streamstats avg(AverageCount) as Strm_AverageCount

These get me the averages, but I can't work out how to compare them to previous values over time.
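
One way to compare each bucket against the buckets before it is `streamstats` with `current=f`, which leaves the current row out of the running average. A minimal sketch, assuming the `OPERATION` field name from the original question, a 2-hour lookback, and a trailing window of 12 five-minute buckets (both values are arbitrary illustrations):

```spl
index="vip" (host=ship*be* OR host=van*be*) OPERATION="Validate"
source="/app/logs/vipservices/vipservices.log" earliest=-2h
| bucket span=5m _time
| stats count by host _time
| sort 0 host _time
| streamstats current=f global=f window=12 avg(count) as trailingAvg by host
| where count > 2 * trailingAvg
| table _time host count trailingAvg
```

`global=f` keeps a separate window per host rather than one window shared across all rows. The search returns every bucket in the range that exceeded twice its trailing average, so for an alert you would likely keep only the most recent complete bucket.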