When I ran a search spanning an entire year it took 241 seconds. If I immediately rerun the search the time plummets to ~60 seconds. Why? Is this a Splunk or Disk optimization?
Background:
hot/warm sit on fast disk.
colddb resides on slower, larger disk.
Regardless of the search I run, the first time the data is polled the reply is always slower. When I rerun the same exact search over the same exact disks, the times drop considerably. Who's responsible? (Who can I thank?) Splunk or the disks... and is it that simple, or is it more complex? I understand that searching back into colddb will be slower than hot/warm. My question is more of a lower-level, backend one. But it's one I want to share with my user base when I advise them how to tune their searches and what will happen when they rerun a search.
I've looked through a lot of the Answers and on Splunk's site but can't really find the answer. This group is outstanding, so I'm leaning on you. Any insight is appreciated.
pstein
Splunk caches search results for a set period of time (configurable by the admin)
So if you run the exact same search while the data is still cached - the search will return results much faster the second time
But the search must truly be identical
It's not exactly for a set period of time, it's related to your user (and/or role) disk quota - https://docs.splunk.com/Documentation/Splunk/8.0.1/Admin/authorizeconf (though search results expiration time does factor in)
For ad-hoc searches, you need to thank the OS and its disk cache. There is no caching in Splunk that makes those reruns faster - and that's a good thing, otherwise you'd get old results.
For completeness' sake, there is a caching mechanism in dashboards; search docs.splunk.com/Documentation/Splunk/8.0.1/Viz/PanelreferenceforSimplifiedXML for "cache" to find out more.
I downvoted this post because your OS isn't going to magically cache things - especially in a clustered environment.
Your user/role disk quota/cache is what's factoring in here.
Repeated downvotes don't change facts... it's not magic either: https://en.wikipedia.org/wiki/Page_cache
You insisting on something that's irrelevant doesn't help your case
The OS doesn't magically cache extra data just because you wish it would
You can wish that all you want - doesn't change reality
What extra data?
The first search reads a bucket off disk, the cache keeps those files in memory.
The second search reads the same bucket again, cache serves files from memory much faster.
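A minimal sketch of that effect, in Python rather than SPL: read the same file twice and time each pass. The second read is typically served from the OS page cache. Note this is an illustration under assumptions - writing the file also warms the cache, so a truly cold first read would require dropping caches first (e.g. `echo 3 > /proc/sys/vm/drop_caches` as root on Linux), and timings will vary by OS and hardware.

```python
import os
import tempfile
import time

def timed_read(path):
    """Read the whole file; return (elapsed seconds, file contents)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return time.perf_counter() - start, data

# Write a reasonably large file so the read time is measurable.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(16 * 1024 * 1024))  # 16 MiB of throwaway data
    path = tmp.name

try:
    cold_time, first = timed_read(path)   # may hit disk (unless already cached)
    warm_time, second = timed_read(path)  # typically served from the page cache
    print(f"first read:  {cold_time:.4f}s")
    print(f"second read: {warm_time:.4f}s")
    # The cache changes speed, never content - both reads return identical bytes.
    assert first == second
finally:
    os.remove(path)
```

The same principle applies to Splunk's tsidx and rawdata files: the first search pulls bucket files off disk and the kernel keeps those pages in memory, so the rerun reads them from RAM.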
@martin_mueller
This also helps complete my question. Your link to the dashboard caching mechanism will help others as well. You too get an 'Accept', but I can only give out one. But in my heart you deserve one.
For even more insight, post screenshots of the top section of the job inspector for the first slow run and another for the second fast run.
Unfortunately, I'm unable to upload an image from my desktop to share.
It's ok. With the answers, I understand.
Your OS probably isn't caching that much - unless you happen to run identical searches frequently: i.e., ones with the same static time settings, or ones from summary indices.
The OS caches a ton of things: Your entire disks, if you have the memory for it.
I downvoted this post because your OS won't cache the whole disk even if it has the memory - OSes use some memory for caching, but they don't magically cache things that haven't been asked for (and won't keep around things that haven't been accessed in a long while).
I'm pretty sure this is not the case. You're probably seeing the effect of the OS caching tsidx files. Or, if you're on Splunk Cloud, you're seeing the effect of buckets in s3 getting localized to the search peers on the first search, so the second search doesn't need to copy buckets from s3.
I got clarification from engineering:
If your searches are EXACTLY the same, including the resolved time range (which is rare unless you use exact time ranges), then it will reuse.
Basically, if you run a search, wait 1 second, then run it again, we do not reuse.
With typical relative time ranges like earliest=-5m latest=now it will not reuse, but with snapped ranges like earliest=-2d@d latest=-1d@d it can end up reusing.
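A toy sketch of why that is, in Python rather than SPL: a now-relative range resolves to different epochs on every run, while a day-snapped range resolves identically for any two runs within the same day. `snap_to_day` and `resolve` below are illustrative helpers I've made up for this example, not Splunk's actual time parser.

```python
from datetime import datetime, timedelta

def snap_to_day(dt):
    """Mimic Splunk's '@d' snap modifier: truncate to midnight."""
    return dt.replace(hour=0, minute=0, second=0, microsecond=0)

def resolve(days_back_earliest, days_back_latest, now, snap=False):
    """Toy resolver for '-Nd'/'-Nd@d'-style ranges (illustration only)."""
    earliest = now - timedelta(days=days_back_earliest)
    latest = now - timedelta(days=days_back_latest)
    if snap:
        earliest, latest = snap_to_day(earliest), snap_to_day(latest)
    return earliest, latest

run1 = datetime(2020, 1, 14, 10, 30, 0)
run2 = run1 + timedelta(seconds=1)  # the same search, rerun one second later

# Snapped (like earliest=-2d@d latest=-1d@d): both runs resolve to the
# same concrete range, so the searches really are identical -> reuse possible.
assert resolve(2, 1, run1, snap=True) == resolve(2, 1, run2, snap=True)

# Unsnapped (like earliest=-5m latest=now): the resolved epochs shift with
# every run, so the searches are never truly identical -> no reuse.
assert resolve(2, 1, run1, snap=False) != resolve(2, 1, run2, snap=False)
```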
My testing disagrees: running index=_internal earliest=-w@d latest=@d | stats count by component
twice (same search, same time range, no time-window creep happening), I still see significant time consumed. If reuse happened, it should finish in a fraction of a second.
Using the exact same time range: earliest=1/14/2019:00:00:00 latest=01/13/2020:24:00:00
From your note, I would be seeing the effect of the OS caching tsidx files on the on-prem cluster.
Thanks, @wmyersas
That's what I was looking to confirm. Appreciate your guidance and sharing the link for others to read up on.
Can you point at docs for configuring that?
Check the disk-quota settings in Authorize.conf - https://docs.splunk.com/Documentation/Splunk/8.0.1/Admin/authorizeconf
The disk quota stores search artifacts (= results), but those are not (re)used for future searches.