Splunk Search

Multiple uses of the same lookup table - how to simplify?

pm771
Communicator

I have a stream of events that have names, and each name belongs to a certain category.

For this example, there will be two categories: "24X7" and "custom".

There are 2 lookup tables: NoEventDates (aka Holiday table) and ZeroEvents.

The ZeroEvents table has a subset of all possible event names with additional parameters:

Category | event   | HourFrom | HourTo | HolidaysOff | DaysOfWeekOff
-------- | ------- | -------- | ------ | ----------- | -------------
custom   | Event11 | 11       | 12     | Y           |
custom   | Event12 |          |        | N           | 0,6
custom   | Event13 |          |        | N           |
custom   | Event14 |          |        | Y           | 5,6
24X7     | Event21 | 0        | 24     |             |
24X7     | Event22 | 0        | 24     |             |
24X7     | Event23 | 0        | 24     |             |

  

"24X7" events are expected within every 15-minute window, all day long, with no holidays or weekends off.
Custom events can have days of the year (holidays) and/or days of the week (such as weekends) when no events are expected.
On any day a custom event is expected, it is guaranteed to arrive only within its prescribed time range.

The task is to discover "missing" events situation as quickly as possible.

Custom events will be monitored every 15 minutes by a sliding 2-hour window within their prescribed hours.
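In scheduling terms, that cadence can be expressed in the search's time bounds (a sketch; the 15-minute cron schedule itself lives in the saved-search configuration, and the index name is a placeholder):

```
index=... earliest=-2h@m latest=now
| stats count as eventscount by event
```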

For "24X7" I have the following query:

index=...

    [| inputlookup ZeroEvents.csv | where Category="24X7" | fields event | format]
| stats count as eventscount by event
| append [| inputlookup ZeroEvents.csv | where Category="24X7" | fields event | eval eventscount=0]
| stats sum(eventscount) as total by event
| where total < 1
| stats count as number
| eval NetcoolTitle=number + " 24X7 events with no messages"


I did not need to use the Holiday table for that case.
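An equivalent way to find zero-count 24X7 events is to start from the lookup and left-join the counts instead of appending zero rows (a sketch of the same idea; note that join subsearches are subject to Splunk's subsearch result limits, so the append+stats pattern may scale better):

```
| inputlookup ZeroEvents.csv
| where Category="24X7"
| fields event
| join type=left event
    [ search index=... | stats count as eventscount by event ]
| fillnull value=0 eventscount
| where eventscount < 1
| stats count as number
```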

For custom events it gets more complicated, and I'm stuck trying to find a way not to repeat all the conditions twice.

Here is the structure, with one part of the append subsearch hard-coded:

index=...

    [| inputlookup ZeroEvents.csv | where Category="custom" | fields event | format]
| eval date=strftime(_time, "%Y-%m-%d")
| lookup NoEventDates.csv NEDate as date OUTPUT NEDate as Holiday
| eval Holiday=if(isnull(Holiday), "N", "Y")
| eval DOW=strftime(_time, "%w")
| eval currentHour=strftime(now(), "%H")
| lookup ZeroEvents.csv event OUTPUT HolidaysOff DaysOfWeekOff HourFrom HourTo
| where NOT match(DaysOfWeekOff, DOW) AND (Holiday="N" OR HolidaysOff="N") AND currentHour >= HourFrom AND currentHour <= HourTo
| stats count as eventscount by event
| append [| inputlookup ZeroEvents.csv | eval DOW="0", Holiday="N", currentHour=1 | where Category="custom" AND NOT match(DaysOfWeekOff, DOW) AND (Holiday="N" OR HolidaysOff="N") AND currentHour >= HourFrom AND currentHour <= HourTo | eval eventscount=0 | fields event eventscount]
| stats sum(eventscount) as total by event
| where total < 1

 

As I mentioned, the hard-coded `eval DOW="0", Holiday="N", currentHour=1` should instead be recalculated using the same logic, or I need some way to use variables from the outer scope.

Is there a simpler way to write such lookup-based queries?

Is there a solution without a massive code duplication for "custom" events?


pm771
Communicator

[Answering my own question.  Any ideas for improvement are welcome]

index=...

    [| inputlookup ZeroEvents.csv | where Category="custom" | fields event | format]
| stats count as eventscount by event
| append [
    | inputlookup ZeroEvents.csv
    | eval currentdate=strftime(now(), "%Y-%m-%d")
    | lookup NoEventDates.csv NEDate as currentdate OUTPUT NEDate as Holiday
    | eval Holiday=if(isnull(Holiday), "N", "Y"), DOW=strftime(now(), "%w"), currentHour=strftime(now(), "%H")
    | where Category="custom" AND NOT match(DaysOfWeekOff, DOW) AND (Holiday="N" OR HolidaysOff="N") AND currentHour >= HourFrom AND currentHour <= HourTo
    | fields event
    | eval eventscount=0
  ]
| stats sum(eventscount) as total by event
| where total < 1

 

The main change is that almost all of the filtering is done on the lookup table itself.

Advantages:

  1. The DRY principle is obeyed (almost).
  2. The main lookup logic is isolated and can be debugged on its own.

To test the append subsearch on its own, replace the four computed variables (currentdate, Holiday, DOW, currentHour) with the desired hard-coded values.
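For example, to simulate a non-holiday Sunday at 10:00 (a debugging sketch; the hard-coded values are placeholders, and the field names follow the lookup-table columns shown above):

```
| inputlookup ZeroEvents.csv
| eval Holiday="N", DOW="0", currentHour="10"
| where Category="custom" AND NOT match(DaysOfWeekOff, DOW)
    AND (Holiday="N" OR HolidaysOff="N")
    AND currentHour >= HourFrom AND currentHour <= HourTo
| fields event
```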

Due to the late filtering, the query performs some extra work, but its performance is still acceptable.

 
