How do I create and trigger an alert in real-time ...

asnamat · ‎04-15-2015

I am looking to create an alert which would trigger in real-time if an event from esxi device is triggered for lost redundancy to storage and not followed by a restored event within an hour. In the below example, storage redundancy was lost for RTP1-VIF19-7072-LUN005 data store, but was restored after five minutes. I want Splunk to generate an alert if the redundancy is not restored within an hour for any device.

Example:

problem event:

2015-04-15T15:30:53+00:00 rtp1-vif064-17  [scsiCorrelator] 2403930999128us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.60001440000000107072444926f563d3 degraded. Path vmhba2:C0:T2:L2 is down. Affected datastores: "RTP1-VIF19-7072-LUN005".

Restore event:

2015-04-15T15:35:53+00:00 rtp1-vif064-17 [411C1B70 info 'Vimsvc.ha-eventmgr'] Event 1294 : Path redundancy to storage device naa.60001440000000107072444926f563d3 (Datastores: RTP1-VIF19-7072-LUN005) restored. Path vmhba2:C0:T2:L2 is active again.

lguinn2 · ‎04-15-2015

"esx.problem.storage.redundancy.degraded" OR ("Path redundancy to storage device" AND "restored") earliest=-62m
| rex "^\s*\S+\s(?<datastore>\S+)\s"
| transaction datastore startswith="esx.problem.storage.redundancy.degraded" 
                        endswith="Path redundancy to storage device" keepevicted=1
| search "esx.problem.storage.redundancy.degraded"
| where duration>3600 OR eventcount=1
| eval problemStartTime = _time
| eval problemDuration = if(eventcount==1,now()-problemStartTime,duration)
| where problemDuration > 3600
| eval problemStartTime = strftime(problemStartTime,"%x %X")
| eval problemDuration = tostring(problemDuration,"duration")
| table datastore problemStartTime problemDuration

Schedule the alert to run once each minute and trigger your alert on "number of results greater than zero."

Also, you might be able to add more to the initial search to make it more efficient.

How do I create and trigger an alert in real-time if EventA is not followed by EventB within an hour ?

Updated Team Landing Page in Splunk Observability

New! Splunk Observability Search Enhancements for Splunk APM Services/Traces and ...

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...