Splunk Search

How to edit my search to find if a service is down, then trigger an alert only if the status is still down 1 minute later?

splunker9999
Path Finder

Hi,

We have a search that runs every minute and triggers an alert if it finds that any service is down. We would like to enhance it so that the search still runs at 1-minute intervals, but does not trigger an alert the first time it finds a service down. Instead, it should wait one more minute and trigger only if the status is still "Service Down".

Below is our basic search:

index=index1 sourcetype=WMI Caption="*" host=WGP
| stats latest(State) AS State by _time host Name
| rename Name as Service
| search State=Stopped
| eval currentTime=now()
0 Karma
1 Solution

sundareshr
Legend

Try this. Schedule it to run every two minutes and trigger the alert if count >= 2.

index=index1 sourcetype=WMI Caption="*" host=WGP earliest=-2m@m State="Stopped"
| bin span=2m _time
| stats count by _time host Name
| rename Name as Service
| eval currentTime=now()

0 Karma

Raschko
Communicator

The bin span should be 1m, correct?

0 Karma

splunker9999
Path Finder

Hi,

Should the bin span be 1m?

Thanks

0 Karma

sundareshr
Legend

You want it to be 2m. If you have this scenario twice in quick succession, you want the alert, right?

0 Karma

splunker9999
Path Finder

Yes. When we ran this search manually with that condition, it returned no results, which is correct. But once we scheduled it as an alert, we somehow received alerts even though the condition was not satisfied.

Will this condition work? I have just added where count>=2:

index=index1 sourcetype=WMI Caption="*" host=WGP earliest=-2m@m State="Stopped"
| bin span=2m _time
| stats count by _time host Name
| where count>=2
| rename Name as Service
| eval currentTime=now()

The reason is that we have a script that automatically raises a ServiceNow ticket whenever the search returns any results. With count>=2 set only as the alert trigger condition, we do not get the alert, but the ServiceNow ticket is still raised automatically. So I thought of moving the condition into the query itself?

0 Karma

sundareshr
Legend

Yes, it should work
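
If the goal is strictly "stopped in two consecutive one-minute checks", a possible variant is sketched below. It is only a sketch and rests on assumptions not confirmed in this thread: that the WMI input reports each service at least once per minute, and that the search is scheduled so the two most recent complete minutes (earliest=-2m@m latest=@m) are the window to check. The field name stopped_minutes is made up for illustration.

```spl
index=index1 sourcetype=WMI Caption="*" host=WGP earliest=-2m@m latest=@m
| bin span=1m _time
| stats latest(State) AS State by _time host Name
| stats count(eval(State="Stopped")) AS stopped_minutes by host Name
| where stopped_minutes>=2
| rename Name as Service
| eval currentTime=now()
```

Because the where clause sits inside the query, the search returns rows only when a service was last seen as Stopped in both minutes, so a script that acts on any result would fire only in that case.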

0 Karma