Splunk Search

Will Lookaheads/Lookbehinds Hurt Search Performance?

skoelpin
SplunkTrust

I have an index which processes around 10 million events per day. I did a few field extractions which had lookaheads and lookbehinds. Will this hurt my search performance with such massive volumes?

1 Solution

woodcock
Esteemed Legend

IMHO, you should avoid them because they do have an impact and it does add up. On the other hand, sometimes it is unavoidable. I had a production source that generated CDRs with start and stop times. For durationful events, you should ALWAYS use the stop time. However, sometimes these records had NULL stop times and we needed to use the start time as a fallback timestamp. To do this we used a lookahead for TIME_PREFIX, which is a bad situation. Even though we had thousands of CDRs a second, we did not notice an impact when we deployed this change (and the cluster was not very large and had little extra horsepower). So in my limited experience, deploying one lookahead was unnoticeable, but I am sure deploying dozens of them would have been. Do what you have to do, keep an eye on your situation so you can stay ahead of the performance curve, and upscale your cluster horsepower as you add things in.
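For context, a TIME_PREFIX fallback like the one described might look roughly like this in props.conf. This is only a sketch: the sourcetype name and field layout are hypothetical, not from the original deployment.

```ini
# props.conf -- hypothetical CDR sourcetype (names are made up)
[cdr:records]
# Prefer the stop time; when the stop field is NULL, fall back to the start time.
# The negative lookahead (?!NULL) makes the first alternative fail on NULL stop
# times, so the engine falls through to matching start_time= instead.
TIME_PREFIX = stop_time=(?!NULL)|start_time=
MAX_TIMESTAMP_LOOKAHEAD = 30
```

The lookahead is what adds per-event work at index time: the regex engine has to test it at every candidate position in each event before timestamp extraction can begin.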


skoelpin
SplunkTrust

Good explanation. I'm doing the extractions now, and we expect a large increase in events down the road. Even though it's not affecting performance much right now, I don't want to hurt myself in the future.


woodcock
Esteemed Legend

It is the same situation as subsearches and transaction. I bend over backwards to avoid using them (poor performance), but sometimes it is the only way to do it. You've gotta do what you've gotta do.
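As an illustration of that avoidance, the usual faster alternative to transaction is stats (a generic SPL sketch; the index, sourcetype, and field names here are hypothetical):

```
index=myindex sourcetype=cdr
| stats earliest(_time) AS start latest(_time) AS stop BY session_id
| eval duration = stop - start
```

This produces per-session durations like `| transaction session_id` would, but stats is distributable across indexers and does not hold whole event groups in memory, which is why it scales much better at high volumes.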


bmacias84
Champion

Possibly, depending on how many steps it takes to match. If it takes 50 steps, probably not; but if it takes 2,500 steps per event, then possibly. Just compare what you see when you run a search in Fast mode versus Verbose/Smart mode. The more regexes you apply, and the less efficient each regex is, the slower your searches.
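To get a rough feel for the per-event cost difference outside of Splunk, you can time a lookbehind-based extraction against a plain one (a sketch using Python's `re` module; the log line and patterns are made up for illustration, and absolute timings will differ from Splunk's PCRE engine):

```python
import re
import timeit

line = "2024-01-01 12:00:00 user=alice action=login status=200"

# Plain extraction: match the field name and capture the value directly.
plain = re.compile(r"user=(\w+)")

# Lookbehind-based extraction: same result, but more engine work per event.
lookbehind = re.compile(r"(?<=user=)\w+")

# Both patterns extract the same value.
assert plain.search(line).group(1) == "alice"
assert lookbehind.search(line).group(0) == "alice"

# Time each pattern over many iterations, as a stand-in for "per event" cost.
t_plain = timeit.timeit(lambda: plain.search(line), number=100_000)
t_look = timeit.timeit(lambda: lookbehind.search(line), number=100_000)
print(f"plain: {t_plain:.3f}s  lookbehind: {t_look:.3f}s")
```

A few extra microseconds per match is invisible on one extraction, but multiplied across dozens of extractions and millions of events per day it becomes the kind of creep woodcock describes above.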
