Splunk Search

What's a faster search than a Nested IF statement?

skoelpin
SplunkTrust
SplunkTrust

I wrote this search to look at a user agent string (RTG_Browser) and identify the operating system. I plan on writing another search to identify the browser from the user agent string. I then plan to append these columns to my current query. A regular expression would not work here as it has to be renamed from Windows NT 6.2 to Windows 8.

index=access | eval OS = if(match(RTG_Browser,"Windows NT 6.1"), "Windows 7", if (match(RTG_Browser,"Windows NT 6.3") OR match(RTG_Browser,"Windows NT 6.2"), "Windows 8", if (match(RTG_Browser,"Macintosh"), "OS X", if(match(RTG_Browser,"like Mac OS X") OR match(RTG_Browser,"like mac o sx"),"m.iOS", if(match(RTG_Browser,"Android"), "m.Android", if(match(RTG_Browser,"Windows NT 5.1") OR match(RTG_Browser,"Windows NT 5.2"),"Windows XP", if(match(RTG_Browser, "bingbot"), "Bing Bot", if(match(RTG_Browser,"Windows Phone"),"m.Windows", if(match(RTG_Browser, "Windows NT 6.0"),"Windows Vista", if(match(RTG_Browser,"Windows NT 10.0"),"Windows 10",if(match(RTG_Browser, "X11") OR match(RTG_Browser, "Linux"),"Linux",if(match(RTG_Browser,"Googlebot"),"Google Bot","OTHER"))))))))))))
0 Karma
1 Solution

woodcock
Esteemed Legend

Instead of nesting your if statements, use a case statement like this (but it probably has the same performance even though it is cleaner and easier to read):

index=access| eval OS =case(
   (RTG_Browser="Windows NT 6.1"), "Windows 7", 
   ((RTG_Browser="Windows NT 6.3") OR (RTG_Browser="Windows NT 6.2")), "Windows 8", 
   (match(RTG_Browser,"Macintosh")), "OS X", 
   (match(RTG_Browser,"like Mac OS X") OR match(RTG_Browser,"like mac o sx")),"m.iOS", 
   (match(RTG_Browser,"Android")), "m.Android", 
   (match(RTG_Browser,"Windows NT 5.1") OR match(RTG_Browser,"Windows NT 5.2")),"Windows XP", 
   (match(RTG_Browser, "bingbot")), "Bing Bot", 
   (match(RTG_Browser,"Windows Phone")),"m.Windows", 
   (match(RTG_Browser, "Windows NT 6.0")),"Windows Vista", 
   (match(RTG_Browser,"Windows NT 10.0")),"Windows 10", 
   (match(RTG_Browser, "X11") OR match(RTG_Browser, "Linux")),"Linux", 
   (match(RTG_Browser,"Googlebot")),"Google Bot",
   (1==1), "OTHER")

View solution in original post

woodcock
Esteemed Legend

Instead of nesting your if statements, use a case statement like this (but it probably has the same performance even though it is cleaner and easier to read):

index=access| eval OS =case(
   (RTG_Browser="Windows NT 6.1"), "Windows 7", 
   ((RTG_Browser="Windows NT 6.3") OR (RTG_Browser="Windows NT 6.2")), "Windows 8", 
   (match(RTG_Browser,"Macintosh")), "OS X", 
   (match(RTG_Browser,"like Mac OS X") OR match(RTG_Browser,"like mac o sx")),"m.iOS", 
   (match(RTG_Browser,"Android")), "m.Android", 
   (match(RTG_Browser,"Windows NT 5.1") OR match(RTG_Browser,"Windows NT 5.2")),"Windows XP", 
   (match(RTG_Browser, "bingbot")), "Bing Bot", 
   (match(RTG_Browser,"Windows Phone")),"m.Windows", 
   (match(RTG_Browser, "Windows NT 6.0")),"Windows Vista", 
   (match(RTG_Browser,"Windows NT 10.0")),"Windows 10", 
   (match(RTG_Browser, "X11") OR match(RTG_Browser, "Linux")),"Linux", 
   (match(RTG_Browser,"Googlebot")),"Google Bot",
   (1==1), "OTHER")

skoelpin
SplunkTrust
SplunkTrust

Would a case statement be faster then a lookup?

0 Karma

woodcock
Esteemed Legend

A lookup would probably be faster but it depends so the only sure way to know is to do it both ways and then examine the job inspector to see which one was faster on various test searches.

mreynov_splunk
Splunk Employee
Splunk Employee

This seems ripe for a lookup.

mreynov_splunk
Splunk Employee
Splunk Employee

io operation of opening a file should be miliseconds and it happens once per session.

skoelpin
SplunkTrust
SplunkTrust

Do you suggest I put this in a CSV file then do a lookup? Would the search performance be much faster with a lookup compared to a nested-if?

0 Karma

mreynov_splunk
Splunk Employee
Splunk Employee

yes, I suggest a lookup even though there is a price to pay for opening the file, but once that's done, it will work like a case statement, so the computation should be fast.

skoelpin
SplunkTrust
SplunkTrust

Can you explain the performance cost with opening a file? Would it be a one time thing on the first search or would it be every time? What's the performance hit? Are we talking seconds or minutes?

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

Splunk is officially part of Cisco

Revolutionizing how our customers build resilience across their entire digital footprint.   Splunk ...

Splunk APM & RUM | Planned Maintenance March 26 - March 28, 2024

There will be planned maintenance for Splunk APM and RUM between March 26, 2024 and March 28, 2024 as ...