Alerting

How can I set an alert action that disables alerts that fire more than x-number of times?

rgritt
Engager

Hello,

I'm currently trying to have an alert action that disables alerts that fire more than x number of times. For example, if an alert called 'Test Alert' fires 100 times in 24 hours I want that alert to be disabled.

Right now I see two main ways to do that: either with a custom command or with a custom alert action.

Does anyone have any insight as to how this might be accomplished?

woodcock
Esteemed Legend

Don't even bother with the complication of alert actions at all; just call the REST API from within the SPL of the saved search, like this:

Your stuff to mine alert details using either <index="_audit" action="alert_fired"> OR using <index="_internal" sourcetype="scheduler" thread_id="AlertNotifier*" alert_actions!="summary_index">
| map [| rest /servicesNS/$owner$/$app$/saved/searches/$name$ -d "is_scheduled=0"]
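
A note on the map/rest line above: as far as I know, the SPL rest command only reads endpoints (GET) and has no -d option, so the change itself would have to be a POST to the same endpoint from outside the search, for example from a custom alert action script. Below is a minimal Python sketch under stated assumptions: the function name build_disable_request is hypothetical, the management URL and session key are supplied by the caller, and disabled=1 is used because it disables the saved search (is_scheduled=0 would only unschedule it). The sketch only builds the request; sending it and handling TLS are left out.

```python
import urllib.parse
import urllib.request


def build_disable_request(base_url, owner, app, search_name, session_key):
    """Build (but do not send) a POST that disables one saved search.

    Hypothetical helper: base_url is the splunkd management URL
    (for example https://localhost:8089) and session_key is a valid
    Splunk session key obtained elsewhere.
    """
    # Percent-encode each path segment so names like "Test Alert" are safe.
    path = "/servicesNS/{}/{}/saved/searches/{}".format(
        urllib.parse.quote(owner, safe=""),
        urllib.parse.quote(app, safe=""),
        urllib.parse.quote(search_name, safe=""),
    )
    # Form-encoded body; disabled=1 turns the saved search off.
    body = urllib.parse.urlencode({"disabled": "1"}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={"Authorization": "Splunk " + session_key},
    )
```

Passing the returned object to urllib.request.urlopen would send it. A scheduled search that counts alert_fired events per alert over 24 hours could feed the owner, app and name of each offender to a script like this.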

raptraj1
Observer

Hello Woodcock

Has this REST command to disable the alert ever worked before? I don't know why, but it's giving me this error.

[Screenshot of the error: raptraj1_0-1640592509260.png]

Thanks

DalJeanis
SplunkTrust

There are various ways to throttle an alert. Sounds like you may want something more flexible than the vanilla options. You can code anything you would like in this structure...

your current alert language 
| appendpipe [ some test that puts out field "delaytime" or nothing]
| eventstats max(delaytime) as delaytime
| where isnull(delaytime) 

Depending on run frequency and preferences, the last line can also be ...

| where isnull(delaytime) OR (_time > delaytime)

... but in that last case, you need to set | eval _time = delaytime on the output record so that the record filters itself out.


...for example...

your current alert language that can produce any number of records (we do not care)
| appendpipe
    [| stats count as mycount min(_time) as mintime
     | eval mycount=if(mycount>0,1,0)
     | eval mintime=coalesce(mintime,now())

     | rename COMMENT as "incoming record (if any) has prior mycount and delaytime. Null works fine."  
     | rename COMMENT as "mycount>=100 and delaytime>0 indicates throttling has been tripped"
     | rename COMMENT as "mycount<100 or null and delaytime null indicates throttling has NOT been tripped"
     | inputcsv append=t myalertrecord.csv 
     | stats sum(mycount) as mycount, min(delaytime) as delaytime, min(mintime) as mintime  

     | rename COMMENT as "If delaytime is past, then throttling is ended-- set count to 1, delaytime to null"
     | eval mycount=if(delaytime<mintime,1,mycount)
     | eval delaytime=if(delaytime<mintime,null(),delaytime)

     | rename COMMENT as "If count >=100, then throttling is either in place or needs to start."  
     | rename COMMENT as "Set delaytime - any value received we pass on, otherwise we set for 24 hours from mintime"
     | eval delaytime=case(mycount>=100,coalesce(delaytime,mintime+86400))
     | outputcsv myalertrecord.csv 
     | where isnotnull(delaytime) 
     | eval _time = delaytime 
    ]

| eventstats max(delaytime) as delaytime
| where isnull(delaytime) OR (_time > delaytime)
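
To make the bookkeeping inside the appendpipe easier to follow, here is the same idea as a small Python sketch. It is only one reading of the SPL above, not a replacement for it: ThrottleState, update, THRESHOLD and DELAY_SECONDS are hypothetical names, the CSV file is replaced by an in-memory object, and run_time stands in for mintime.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

THRESHOLD = 100        # firings before throttling starts (assumed from the example)
DELAY_SECONDS = 86400  # throttle for 24 hours


@dataclass
class ThrottleState:
    count: int = 0                       # plays the role of mycount
    delay_until: Optional[float] = None  # plays the role of delaytime


def update(state: ThrottleState, fired: bool, run_time: float) -> Tuple[ThrottleState, bool]:
    """Return the new state and whether this run's alert should be suppressed."""
    # Add this run's firing (0 or 1) to the running total.
    count = state.count + (1 if fired else 0)
    delay_until = state.delay_until

    # If the throttle window has passed, reset the count to 1 and clear the
    # delay, mirroring the "delaytime<mintime" branch of the SPL.
    if delay_until is not None and delay_until < run_time:
        count, delay_until = 1, None

    # Once the threshold is reached, keep any existing delay or start a new one.
    if count >= THRESHOLD and delay_until is None:
        delay_until = run_time + DELAY_SECONDS

    # A non-null delay means the outer "where" drops the alert rows.
    return ThrottleState(count, delay_until), delay_until is not None
```

Calling update once per scheduled run, with the state carried over between runs, corresponds to the inputcsv/outputcsv round trip in the search.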