Let's say I have a search that immediately goes into a lookup against a KV Store collection of 1 million records, followed by a stats by a lookup output field:
index=my_index | lookup ioc_sha256 Indicator_Value AS sha256 OUTPUT Type Malware | stats first(_time) AS _time first(field1) AS field1 values(Malware) AS Malware etc.. by Type sha256
How can the performance of that search be improved? The goal is to show all match_field matches alongside other event-level information from the matching events.
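Independently of how the lookup itself is tuned, you can reduce the work per event by keeping only the fields the pipeline actually uses and by filtering to matched events before stats. A hedged sketch of the question's search with those two tweaks (field names taken from the question; whether `fields` helps depends on your data):

```
index=my_index
| fields _time sha256 field1
| lookup ioc_sha256 Indicator_Value AS sha256 OUTPUT Type Malware
| search Type=*
| stats first(_time) AS _time first(field1) AS field1 values(Malware) AS Malware by Type sha256
```

The `search Type=*` drops events with no lookup match so stats only processes actual hits.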
There are 2.5 efficient ways to run large lookups:
Option 1, bigger-limits CSV files. If you have the memory, raise this limits.conf setting above your CSV file's size (the default is 10000000 bytes, i.e. 10MB; the value below allows up to 100MB):
[lookup]
max_memtable_bytes = 100000000
That will make Splunk build the index structure in memory, giving you very fast lookups of about 8µs per event. It comes with a penalty of about 4s of index-building overhead per search though, so this is only worthwhile when looking up lots of events in one go. My 82MB lookup file produced about 600MB of additional search-process memory footprint.
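For completeness, a file-based lookup of this kind is defined in transforms.conf; the stanza name and filename below are illustrative assumptions, not names from the question:

```
# transforms.conf -- hypothetical CSV lookup definition
[ioc_sha256_csv]
filename = ioc_sha256.csv
```

As long as max_memtable_bytes exceeds the file size, Splunk indexes the whole table in memory; files larger than the limit fall back to a slower disk-based temporary index.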
Option 2, accelerated-fields KV Store lookups. In collections.conf, accelerate your key field:
accelerated_fields.ioc_accel = {"sha256": 1}
or whatever your field name is. That tells MongoDB to build an index on the field, giving me about 15µs per event looked up, with no penalty for single-event searches and no huge memory footprint. For comparison, an unaccelerated KV Store lookup gave me about 6000µs per event looked up, so acceleration is a 400x speedup. Performance numbers are based on my home Splunk, 7.1.2 running on Windows, and 1M randomly generated SHA256 values with just a count as the lookup output field.
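Put together, an accelerated KV Store lookup needs a collection definition plus a lookup definition that points at it. A sketch, with stanza and field names chosen to match the question's search (the collection name is an assumption):

```
# collections.conf -- define the collection and accelerate the key field
[ioc_sha256_collection]
field.sha256 = string
field.Type = string
field.Malware = string
accelerated_fields.ioc_accel = {"sha256": 1}

# transforms.conf -- expose the collection as the lookup used in the search
[ioc_sha256]
external_type = kvstore
collection = ioc_sha256_collection
fields_list = sha256, Type, Malware
```

With this in place, the `| lookup ioc_sha256 ...` call from the question hits the MongoDB index instead of scanning the collection.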
I'd go with option 2, accelerated-field KV Store. You get most of the speedup for the least penalties.