Splunk Search

Speed impact of field extractions?

dpadams
Communicator

I have a custom log format that resembles an Apache access log but carries different data. I've used the interactive field extractor to teach Splunk all of the fields in the table. Because my Splunk is quite slow, I'm wondering whether each field extraction rule has a speed impact.

Should field extraction rules degrade search performance? If so, can anyone point me to the best way to optimize this? As far as I could see from the docs, the normal practice is to define field extractions this way rather than at index time. (That seems backwards to me as a database person but the docs seemed pretty clear on this point.)

Thanks for any guidance, still finding my feet with Splunk.

1 Solution

Stephen_Sorkin
Splunk Employee

The easiest way to determine the impact of field extractions is to check the Search Inspector in the Actions menu while the search is running or after it has completed. If the extraction itself is slow, that will show up under the "kv" component of the search time. More often, though, the slowdown comes from having more fields, because the "timeline" operation then takes longer to construct field summaries.

It is true that index-time fields very rarely help performance.

rroberts
Splunk Employee

Thanks for this!

Stephen_Sorkin
Splunk Employee

dispatch.timeline means collecting statistics about all extracted fields. command.search is a subset of dispatch.fetch, and contains a bunch of components. You are correct, field extractions aren't for speed but rather for correctness and simplicity of retrieval and reporting.

Regarding your search, "red rover" is going to be marginally less efficient because of the wildcards, but if you don't see that reflected in the inspector, you're fine in this case. The reason is that we optimize the search by first looking for the field value as if it were a term, and only then extracting the field and filtering those events.
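To picture the term-first optimization described above, here is a minimal Python sketch. It is not Splunk's implementation: the sample events, the regex, and the function names are all made up for illustration, and substring matching stands in for a wildcarded field search. Only the field name json_post_data and the phrase "red rover" come from this thread.

```python
import re

# Hypothetical sample events; the log layout is an assumption for this sketch.
EVENTS = [
    'user=alice json_post_data="red rover come over"',
    'user=bob msg="red rover" json_post_data="blue team"',
    'user=carol json_post_data="green light"',
]

# Hypothetical extraction rule for the json_post_data field.
FIELD_RE = re.compile(r'json_post_data="([^"]*)"')


def term_prefilter(events, phrase):
    """Phase 1: cheap raw-text match, treating the field value as a plain term."""
    return [e for e in events if phrase in e]


def field_filter(events, phrase):
    """Phase 2: extract the field from the surviving events and filter on it."""
    hits = []
    for e in events:
        m = FIELD_RE.search(e)
        if m and phrase in m.group(1):
            hits.append(e)
    return hits


def search_field(events, phrase):
    """Run the cheap term match first, then the costlier field extraction."""
    return field_filter(term_prefilter(events, phrase), phrase)


candidates = term_prefilter(EVENTS, "red rover")
matches = search_field(EVENTS, "red rover")
print(len(candidates), "candidate events,", len(matches), "field match")
```

In this toy data the raw-term pass keeps two events, and the extraction pass then drops the one where the phrase sits in a different field. The extraction only runs on events that already contain the term, which is why a field-specific search and a free-text search can show similar costs in the inspector.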

dpadams
Communicator

command.search
dispatch.fetch
dispatch.timeline

It seems that the point of a field extraction is not so much to speed up a search. Is the idea to make the search statements clearer and to make reports possible? For example, it's easy to imagine a search that then wants to sum/count/top by a custom field such as internal_user_id or sales_region. Do you reckon I'm on the right track here? I've started to get religion about Splunk but am still pretty new to it. I've been at it for a few weeks but most of the time has been spent prepping logs, configuration files and automating distribution.

dpadams
Communicator

Thanks very much for your answer - I've been enjoying some of your videos recently. I had not noticed the search inspector; it looks great. I've tried out a small comparison and am unclear what the results indicate. I've got a JSON payload in my logs in an extracted field called json_post_data. If I want to find the phrase "red rover", I can do a free search for "red rover" or a field-specific search for json_post_data="red rover". The search inspector shows somewhat different execution paths for the two searches, but overall similar performance. The most time-consuming elements appear in both
