Knowledge Management

Is there any other way to improve performance of large KV store lookup?

alexlombardi
Observer

I have a KV store lookup with approximately 500k rows that is used on a regular basis in a number of searches we run daily to output enrichment data to our logs (email, phone number, location, etc.). When this lookup is used in conjunction with a | lookup command against a large search I can see in the inspector that the lookup command is taking almost 2 minutes at times to run through these large list searches. I've already tried using accelerated fields and minimizing the output fields, but saw little improvement. Is there anything else people use to improve search time performance on a KV store lookup when running against a large amount of data? On average the lookup for each large search is probably referencing between 20k-100k entries of the 500k total and outputting 5-7 fields.

Labels (1)
0 Karma

AnilPujar
Path Finder

@niketn 

I have 2 kv_store of each around 200,000 records and each kv_store of around 50MB size

| inputlookup kv_store1
| lookup kv_store2 common_field output req_field1

This query takes around 500 to 600 seconds to run.

any suggestions to get the result in around 10 seconds?

Please suggest

0 Karma

niketn
Legend

@alexlombardi you can try out following two options:

  1. Accelerated fields for the fields which are used for joining your SPLs to the KVStore (restrict fields to key input fields rather than enrichment output fields). Refer to answer: https://answers.splunk.com/answers/333540/for-large-lookups-should-i-use-the-kv-store.html

  2. Indexer Replication for KV Store Knowledge Objects (this takes disk and network for replicating KVStore data as CSV bundle to indexer). This will allow map-reduce advantage in case only streaming commands are used prior to KVStore lookup. Refer to answer: https://answers.splunk.com/answers/390686/how-to-replicate-a-search-head-clusters-kv-store-l.html

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

alexlombardi
Observer

@niketnilay

Thanks for the suggestions. I am already replicating this KV store to our indexers, however I haven't seen much of a difference between using the local copies on the SHs versus using the indexer copies. As for the accelerated fields, I currently heave a setting for each field and I have not tried combining them as of yet. I'll need to test if there is any real difference in performance between the too.

0 Karma

niketn
Legend

@alexlombardi using index replication you cant take the advantage of map-reduce if lookup command is used after a transforming/non-streaming command and lookup is processed on SH instead of Indexers.

Refer to the following documentation: https://docs.splunk.com/Documentation/Splunk/latest/Search/Writebettersearches#Parallel_processing_e...

https://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Parallelreduceoverview

Also check out lookup command optimization: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Lookup#Optimizing_your_lookup_se...

Lookup command can be forced to run on SH but by default runs on Indexer provided all the commands prior to lookup are purely streaming. You would need to run performance tests to ensure position of lookup command as sometimes map-reduce can get you better results and other times transforming commands can reduce the result set to too low.

We have seen advantage of both the steps. Index replication of KV Store and accelerating only the required field usually it is only one or two per KV store like unique ID field or unique Name field. We store 3-5 million rows in the KV Stores.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...