
Difference between WHERE and SEARCH commands

tsunamii
Path Finder

What are the differences between "where" and "search"? I read somewhere that "search" tends to cause more overhead. The search below, run over one day of netflow data, takes more than 24 hours to complete.


index=proxy* s_op=GET | lookup geoip clientip as d_ip | where client_country="Russian Federation" OR client_country="Ukraine" OR client_country="Romania" OR client_country="Bulgaria" OR client_country="Latvia" OR client_country="Azerbaijan" OR client_country="Kazakstan" OR client_country="Macedonia" OR client_country="Serbia" | table _time c_ip d_ip r_host client_country client_city cs_bytes d_port cs_uri referer c_agent

1 Solution

martin_mueller
SplunkTrust

I'm going to guess that the search takes that long because it's reading a boatload of events off disk and performing the lookup on every one of them, only to then possibly throw most of them out. The where (or search) after that isn't going to add much to the runtime of that pipeline.
What kind of lookup is that, scripted? How many events are you loading? What are you actually looking for as a result? Could you possibly pre-aggregate the data before looking up the location? Have you considered using the Splunk 6 iplocation command to maybe speed up the lookup process?
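
A rough sketch of the pre-aggregation idea, assuming per-destination totals are an acceptable result. It reuses the geoip lookup and field names from the question; the requests and total_bytes names are made up for illustration, the country list is shortened, and the per-event columns from the original table are dropped. The point is that the lookup runs once per distinct d_ip instead of once per event:

```
index=proxy* s_op=GET
| stats count AS requests sum(cs_bytes) AS total_bytes BY d_ip
| lookup geoip clientip AS d_ip
| search client_country="Russian Federation" OR client_country="Ukraine"
| table d_ip client_country client_city requests total_bytes
```

If the built-in command is used instead, the lookup line would become `| iplocation d_ip`, which emits fields such as Country and City rather than client_country and client_city. Its country names may not be spelled the same way as in the geoip lookup, so the filter values would need checking.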

As for the question from the title: as filters further down the pipeline, search and where mostly differ in what they can do, and how. where only evaluates boolean expressions, so to do a wildcard filter you have to explicitly call match(), while search can just do field=value*. I doubt there's a significant performance difference between the two when they do the same filtering, especially compared to the cost of loading the events at the start of the pipeline.
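
To illustrate the wildcard difference, here are three roughly equivalent filters on the r_host field from the question. The ".ru" suffix is only an example value, and the leading "..." stands for whatever comes earlier in the pipeline:

```
... | search r_host=*.ru
... | where like(r_host, "%.ru")
... | where match(r_host, "\.ru$")
```

These are not exact substitutes: search compares field values case-insensitively, while like() and match() in where are case-sensitive. like() uses SQL-style % and _ wildcards, and match() takes a regular expression.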

