Splunk Search

How to remove events from search results for known time ranges to differentiate between infrastructure outage and software usage timeouts?

drmed
Explorer

We occasionally have infrastructure outages that result in a higher number of timeouts during the outage period. We would like to differentiate between these and everyday software usage timeouts. Is there a way to remove these outages from search results if we know the date-time range? There doesn't appear to be a way to search on multiple date-time ranges. Maybe we could combine search results with a UNION?

1 Solution

lguinn2
Legend

Here is one idea. I would consider using a lookup table, where the lookup table had a format like this:

outageStart,outageEnd
1410812296,1410826714

with one line for each infrastructure outage. I used Unix epoch time, since that is easiest for Splunk to compare against _time. You could use a text time format, but you would have to convert it (for example with strptime) before comparing. I called the lookup outage-lookup in the example below.
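If the outage times are kept as text instead, a conversion along these lines could run right after the lookup is read. This is only a sketch: the column names outageStartText and outageEndText and the timestamp layout (for example 2014-09-15 20:18:16) are assumptions, not part of the lookup described above.

```spl
| inputlookup outage-lookup
| eval outageStart = strptime(outageStartText, "%Y-%m-%d %H:%M:%S")
| eval outageEnd = strptime(outageEndText, "%Y-%m-%d %H:%M:%S")
| fields outageStart outageEnd
```

Because this format string carries no time zone, strptime interprets the text in the searching user's time zone, so storing epoch values directly avoids that ambiguity.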

Once you have this table, you have to incorporate it into your search:

yoursearchhere
| eval eventTime = _time
| append [ | inputlookup outage-lookup | addinfo
           | where outageEnd >= info_min_time AND outageStart <= info_max_time
           | eval eventTime = outageStart ]
| sort 0 eventTime
| streamstats max(outageEnd) as outageEndTime
| where isnull(outageStart) AND (isnull(outageEndTime) OR eventTime >= outageEndTime)

The append adds one row per outage that overlaps the search time range, stamped with the outage start time. After sorting, streamstats carries the latest outage end seen so far down onto every following row, so the final where keeps only real events (rows with no outageStart) that fall at or after that end time. The 0 on sort lifts its default limit of 10,000 results.

I am trying to think of an easier way to do this, but this is all I have right now. You could also consider using a macro to implement this...
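As a rough sketch of the macro idea, the filtering steps could be wrapped in a macros.conf stanza like the one below. The macro name exclude_outages is hypothetical, and the definition assumes a lookup named outage-lookup with outageStart and outageEnd columns in epoch seconds; it marks outage rows, fills the latest outage end forward with streamstats, and drops anything that falls inside an outage.

```spl
# Hypothetical macro name; assumes the outage-lookup lookup with epoch-second columns
[exclude_outages]
definition = eval eventTime = _time | append [ | inputlookup outage-lookup | addinfo | where outageEnd >= info_min_time AND outageStart <= info_max_time | eval eventTime = outageStart ] | sort 0 eventTime | streamstats max(outageEnd) as outageEndTime | where isnull(outageStart) AND (isnull(outageEndTime) OR eventTime >= outageEndTime)
iseval = 0
```

A search would then call it after its own base search:

```spl
yoursearchhere | `exclude_outages`
```

Keeping the logic in one macro means the lookup name and the filter only need to be changed in one place.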

drmed
Explorer

Thanks, will give this a try!
