Splunk Search

How to remove events from search results for known time ranges to differentiate between infrastructure outage and software usage timeouts?

drmed
Explorer

We occasionally have infrastructure outages that result in a higher number of timeouts during the outage period. We would like to differentiate between these and everyday software usage timeouts. Is there a way to remove these outages from the search results if we know the date-time range of each one? There doesn't appear to be a way to search on multiple date-time ranges. Maybe we could combine search results with a UNION?

lguinn2
Legend

Here is one idea. I would consider using a lookup table with a format like this:

outageStart,outageEnd
1410812296,1410826714

with one line for each infrastructure outage. I used Unix epoch time (seconds), since Splunk stores _time the same way and the values can be compared directly. You could use a text time format instead, but you would have to convert it to epoch time (for example with strptime) before comparing. I called the lookup outage-lookup in the example below.
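If the lookup holds text timestamps instead, the conversion could be sketched as follows. This assumes the columns look like 2014-09-15 20:18:16; that format string is an assumption, so adjust it to whatever the file actually contains. Without a time zone in the text, strptime interprets the value in the time zone of the search.

```spl
| inputlookup outage-lookup
| eval outageStart = strptime(outageStart, "%Y-%m-%d %H:%M:%S"),
       outageEnd   = strptime(outageEnd, "%Y-%m-%d %H:%M:%S")
```

In the search below, the eval line would go right after the inputlookup inside the subsearch, so that everything downstream still sees epoch values.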

Once you have this table, you have to incorporate it into your search. The idea is to append the outage rows to the events, sort everything by time, carry the latest outage end forward onto each event, and keep only the events that fall outside an outage:

yoursearchhere
| eval eventTime = _time
| append [ | inputlookup outage-lookup | addinfo
           | where outageEnd >= info_min_time AND outageStart <= info_max_time
           | eval eventTime = outageStart ]
| sort 0 eventTime
| streamstats max(outageEnd) as outageEndTime
| where isnull(outageStart) AND (isnull(outageEndTime) OR eventTime >= outageEndTime)

I am trying to think of an easier way to do this, but this is all I have right now. You could also consider using a macro to implement this...
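As a rough sketch of the macro idea: assume a search macro named exclude_outages(1) with a single argument called lookup. Both names are hypothetical. It would be created under Settings > Advanced search > Search macros, and its definition could hold one way of writing the filtering steps, with the lookup name passed in as $lookup$:

```spl
eval eventTime = _time
| append [ | inputlookup $lookup$ | addinfo
           | where outageEnd >= info_min_time AND outageStart <= info_max_time
           | eval eventTime = outageStart ]
| sort 0 eventTime
| streamstats max(outageEnd) as outageEndTime
| where isnull(outageStart) AND (isnull(outageEndTime) OR eventTime >= outageEndTime)
| fields - eventTime outageEndTime
```

Any search could then apply the filter in one line. The backticks are part of the macro call syntax:

```spl
yoursearchhere
| `exclude_outages(outage-lookup)`
```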

drmed
Explorer

Thanks, will give this a try!
