Splunk Search

How to edit my search to find if a service is down, then trigger an alert only if the status is still down 1 minute later?

splunker9999
Path Finder

Hi,

We have a search that runs every minute and triggers an alert if it finds that any service is down. We would like to enhance it so that the search still runs at 1-minute intervals but does not trigger an alert the first time it finds a service down. Instead, it should wait one more minute and trigger only if the status is still "Service Down".

Below is our basic search:

index=index1 sourcetype=WMI Caption="*" host=WGP
| stats latest(State) AS State by _time host Name
| rename Name as Service
| search State=Stopped
| eval currentTime=now() 
0 Karma
1 Solution

sundareshr
Legend

Try this. Run it every two minutes and trigger the alert if count >= 2.

index=index1 sourcetype=WMI Caption="*" host=WGP earliest=-2m@m State="Stopped"
| bin span=2m _time
| stats count by _time host Name
| rename Name as Service
| eval currentTime=now()

0 Karma

Raschko
Communicator

The bin span should be 1m, correct?

0 Karma

splunker9999
Path Finder

Hi,

Should the bin span be 1m?

Thanks

0 Karma

sundareshr
Legend

You want it to be 2m. If you have this scenario twice in quick succession, you want the alert, right?

0 Karma

splunker9999
Path Finder

Yes. When we run this search manually with that condition, it returns no results, which is correct. But when we schedule it as an alert, we somehow still get alerts even though the condition is not satisfied.

Will this condition work? I have just added where count>=2:

index=index1 sourcetype=WMI Caption="*" host=WGP earliest=-2m@m State="Stopped"
| bin span=2m _time
| stats count by _time host Name
| where count>=2
| rename Name as Service
| eval currentTime=now()

The reason is that we have a script that automatically raises a ServiceNow ticket whenever the search returns any results. With the trigger condition set to count>=2 we do not get the alert, but the ServiceNow ticket is still raised automatically, so I thought of moving this condition into the query itself.

0 Karma

sundareshr
Legend

Yes, it should work
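
If it helps, here is a variant sketch of the same idea that keeps the threshold inside the query. It is only a sketch under a few assumptions that are not stated in this thread: the WMI input reports each service roughly once per minute, the alert is scheduled to run every minute, and the field name minutes_down is made up for illustration. Instead of counting raw events in one 2m bin, it counts how many distinct 1-minute buckets in the last two complete minutes contain a "Stopped" event, so two events landing in the same minute do not count as "still down".

```
index=index1 sourcetype=WMI Caption="*" host=WGP State="Stopped" earliest=-2m@m latest=@m
| bin span=1m _time
| stats dc(_time) AS minutes_down BY host Name
| where minutes_down >= 2
| rename Name AS Service
| eval currentTime=now()
```

Because the where clause already filters out services that were down in only one of the two minutes, the search returns no rows in that case, so anything that fires on "any results" (such as the ServiceNow script) should stay quiet until a service has been down for two consecutive minutes.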

0 Karma