Service Downtime Duration

tmarlette
Motivator

I am trying to find the duration of each downtime instance in the last 24 hours using the transaction command. I am currently using WMI to query service state, and I want to visualize when the 'State' field changes from "Running" to "Down", along with the duration between the first "Down" State message and the next "Running" State message.

I'm looking for the results to be in a table that looks kind of like this:
DT1 (time of first down message),DT2 (time of next "Running" message), ,host

Sourcetype=<mysourcetype> Name=<servicename> | transaction State maxpause=10 | timechart max(duration) by Name,host

I don't know if this is the best way to go about this, because my query doesn't seem to return the data I'm looking for.

Any help would be greatly appreciated!

0 Karma
1 Solution

linu1988
Champion

Why don't we keep it simple?

sourcetype=<mysourcetype> | transaction Name startswith="State=Stopped" endswith="State=Running" | stats sum(duration) as "Total Downtime in Seconds" by Name,host

Thanks

tmarlette
Motivator

This works beautifully. I am now piping this into an eval statement to get the percentage of downtime per month. Thank you, sir!
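
For anyone following along, here is a minimal sketch of what that percentage step could look like. It is not the exact search used above. It assumes a fixed 30-day window set with earliest/latest, and the field names downtime_sec, window_sec, and downtime_pct are made up for illustration. The addinfo command is used only to read the search time range (info_min_time and info_max_time), so the window length does not have to be hard-coded.

```spl
sourcetype=<mysourcetype> earliest=-30d@d latest=@d
| transaction Name startswith="State=Stopped" endswith="State=Running"
| stats sum(duration) as downtime_sec by Name, host
| addinfo
| eval window_sec = info_max_time - info_min_time
| eval downtime_pct = round(100 * downtime_sec / window_sec, 2)
| table Name, host, downtime_sec, downtime_pct
```

Because transaction reports duration in seconds, dividing by the window length in seconds gives the fraction of the window spent down. An outage that is still open at the end of the window has no closing "Running" event, so it may not be counted in this sketch.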

0 Karma

somesoni2
SplunkTrust

Try something like this

sourcetype="servicestatus" | streamstats window=1 current=f last(State) as Prev by Name | where NOT State=Prev | transaction Name startswith="State=Down" endswith="State=Running" | eval UpTime=_time+duration | convert ctime(_time) as DownTime ctime(UpTime) as UpTime | table DownTime, UpTime, Name, host
0 Karma

lguinn2
Legend

Can you give more detail about how you know whether the service is "up" or "down"?

0 Karma

lguinn2
Legend

What about this

sourcetype=mysourcetype 
| eval state = if(State=="Down",0,1)
| sort _time
| xyseries _time servicename state

Then view it as a visualization. This will chart all of the servicenames at once, but you could select just a few in the search command.

0 Karma