Hi, every night my server team brings down specific groups of servers and performs maintenance on them. Sometime later that night the servers will be brought back up and I will see the respective interface change in my syslogs.
I created a real-time search that will alert me every time an interface is brought down and then back up. Is it possible, perhaps using the Custom condition search, to notify me if an interface is not brought back up after X amount of hours?
Although rare, the server team occasionally forgets to bring a server back up, and my team ends up troubleshooting the problem the next morning. It would be ideal if I were told that an interface that was brought down never came back up after, say, 4-5 hours.
I currently have two separate real-time searches, one looking for the interface state change to down and the other to up - perhaps I can do all this with a single search?
Any thoughts?
Thanks!
Here is my example - it's pretty limited as you can see, because I don't know much about your environment. I am assuming that you have the following:
a field named interfaceID that uniquely identifies the interface
a field named interfaceStatus that captures the state of the interface - up or down
interfaceStatus=up OR interfaceStatus=down earliest=-8h
| stats latest(interfaceStatus) as currentStatus latest(_time) as timeOfStatus by interfaceID
| where currentStatus="down" AND timeOfStatus < relative_time(now(), "-4h")
| fieldFormat timeOfStatus=strftime(timeOfStatus,"%x %X")
Set this search to run once per hour, and to alert you if the number of results > 0.
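If you would rather set up the schedule and alert condition in a config file than through the UI, a savedsearches.conf stanza along these lines might work. Treat it as a rough sketch: the stanza name and the email address are placeholders I made up, the fieldformat line is left out for brevity, and the field names are still the assumed ones from above.

```
# Hypothetical stanza name; rename to suit your environment
[Interface down longer than 4 hours]
search = interfaceStatus=up OR interfaceStatus=down earliest=-8h \
| stats latest(interfaceStatus) as currentStatus latest(_time) as timeOfStatus by interfaceID \
| where currentStatus="down" AND timeOfStatus < relative_time(now(), "-4h")
# Run at the top of every hour
enableSched = 1
cron_schedule = 0 * * * *
# Trigger the alert when the search returns more than 0 results
counttype = number of events
relation = greater than
quantity = 0
# Placeholder recipient; replace with your team's address
action.email = 1
action.email.to = netops@example.com
```

The 8-hour lookback is wider than the 4-hour threshold so that the most recent "down" event is still inside the search window when the alert condition is evaluated.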
I think this would be relatively easy to do as a single search - and it probably doesn't need to be a real-time search, either. However, I don't really have enough info to show an example search - perhaps you could show the two searches? Anonymized as necessary.