How to Report System Uptime/Downtime?

sjnorman
Explorer

I'm trying to create a system uptime/downtime report under the following conditions:

  • Server starts up, logs a message: "server X starting up"
  • Some critical transactions start to fail but the server remains up. For the purposes of reporting the server is considered down. Log entries start appearing, such as: "transaction Y failed"
  • At some point system functionality has to be restored by either restarting the server or by terminating the process. As such, we can't rely on a graceful shutdown message appearing in the logs.

So, the downtime can be measured as the difference between the first and last occurrence of "transaction Y failed" between "server X starting up" messages.

I'm looking for suggestions on how to create this report. I can determine all of the information via manual searches, but I'd rather automate the process.


somesoni2
SplunkTrust

Try this:

index=yourindex sourcetype=yoursourcetype "server X starting up" OR "transaction * failed"
| rex <<field extraction for message, if not already extracted>>
| sort 0 _time
| eval type=if(like(message,"server % starting up"),"Up","Down")
| streamstats current=f window=1 first(type) as prevType
| eval include=if(type=prevType,"N","Y")
| where include="Y"

This should leave only the "server X starting up" events and the first "transaction Y failed" event after each startup. You can then use the transaction command to calculate the duration between them, which is your downtime.
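As a rough sketch of that last step, the search below pairs each first failure with the next startup and reports the elapsed time. It assumes the message field is already extracted, reuses the placeholder index and sourcetype from above, and introduces a hypothetical output field named downtime. Note that it measures from the first failure to the next "starting up" event, not to the last failure, and that transaction expects events in descending time order, hence the second sort.

```
index=yourindex sourcetype=yoursourcetype "server X starting up" OR "transaction * failed"
| eval type=if(like(message,"server % starting up"),"Up","Down")
| sort 0 _time
| streamstats current=f window=1 first(type) as prevType
| eval include=if(type=prevType,"N","Y")
| where include="Y"
| sort 0 - _time
| transaction startswith=eval(type="Down") endswith=eval(type="Up")
| eval downtime=tostring(duration,"duration")
| table _time downtime
```

If the downtime should end at the last failure rather than at the restart, the filtering step would need to keep that event too, so treat this as a starting point rather than a finished report.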

somesoni2
SplunkTrust

There was a syntax error in the like function. I've updated the search above.


sjnorman
Explorer

I'm getting an error with the like clause. It appears to accept only 2 parameters:

like(message,"server % starting up","Up","Down")
