Unique KeyValue search performance

splunkrg
Explorer

Hey Everyone,

I'm having a bit of trouble with Splunk search performance. I currently have around 1 million rows of logs, each roughly 1 KB wide, that conform to the following pattern:

SomeKey1="stringdata" SomeKey2="stringdata" SomeKey3="stringdata" KeyID="UniqueNumericID"

When I do a search on this data using a simple search query such as:

search sourcetype=sourcetypeid KeyID="1"

It takes 20-30 seconds to return the single matching event on a dedicated server (quad-core Xeon, 16 GB RAM, SATA3 SSD), using either the GUI or the REST API. After inspecting many similar search jobs, the largest consumer of time seems to be dispatch.fetch / dispatch.stream.local. Given that I need to run similar queries very often and programmatically, I assume the best thing to do would be to extract the KeyID field at index time. Would this drastically improve the search speed? Are there any other pitfalls that I may have missed?

Thanks in advance..


Ayn
Legend

As martin_mueller says, it's important to know here how unique the KeyID values are - that is, not only in this specific sourcetype, but across all data in the index.

@dwaddle has explained very well the specifics of what goes on in a Splunk search here: http://answers.splunk.com/answers/54207/slow-search-when-evaluating-a-numeric-value?page=1&focusedAn...
It's a very good read and I think it answers your question. Short version: KeyID="1" will be slow because "1" is very likely a common token in your index. Since most fields aren't extracted until search time, a search for KeyID="1" means that Splunk will in practice find all events containing the token "1" and THEN check whether any of those tokens belong to the field "KeyID". In this scenario an index-time field extraction might be a good idea in order to improve performance.
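
To make the difference concrete, here is a minimal Python sketch of the two lookup strategies. It is a toy model and not Splunk's actual implementation: the sample events, the tokenizer, and the function names (build_index, search_time_lookup, index_time_lookup) are all made up for illustration. The only thing borrowed from Splunk is the idea that an indexed field is stored as a single term of the form KeyID::1.

```python
import re
from collections import defaultdict

# Hypothetical sample events following the pattern from the question.
EVENTS = [
    'SomeKey1="a" SomeKey2="1" SomeKey3="x" KeyID="1"',
    'SomeKey1="1" SomeKey2="b" SomeKey3="y" KeyID="2"',
    'SomeKey1="c" SomeKey2="d" SomeKey3="1" KeyID="3"',
    'SomeKey1="e" SomeKey2="f" SomeKey3="z" KeyID="4"',
]


def build_index(events, indexed_fields=()):
    """Map each token to the set of event numbers that contain it.

    For every name in indexed_fields, also store a combined
    'field::value' term, mimicking an index-time field extraction.
    """
    index = defaultdict(set)
    for event_id, raw in enumerate(events):
        for token in re.findall(r"\w+", raw):
            index[token].add(event_id)
        for field in indexed_fields:
            match = re.search(rf'{re.escape(field)}="([^"]*)"', raw)
            if match:
                index[f"{field}::{match.group(1)}"].add(event_id)
    return index


def search_time_lookup(index, events, field, value):
    """Fetch every event containing the bare value, then filter by field."""
    candidates = index.get(value, set())
    hits = [i for i in sorted(candidates) if f'{field}="{value}"' in events[i]]
    return hits, len(candidates)


def index_time_lookup(index, field, value):
    """Look up the combined 'field::value' term directly."""
    candidates = index.get(f"{field}::{value}", set())
    return sorted(candidates), len(candidates)


if __name__ == "__main__":
    plain = build_index(EVENTS)
    with_keyid = build_index(EVENTS, indexed_fields=("KeyID",))
    print("search-time:", search_time_lookup(plain, EVENTS, "KeyID", "1"))
    print("index-time: ", index_time_lookup(with_keyid, "KeyID", "1"))
```

In this toy data set the search-time path has to inspect three candidate events to return one match, while the indexed-term path inspects only the one event it returns. With a million events and a value as common as "1", that gap is what shows up as time spent fetching events.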

splunkrg
Explorer

Thanks for that, interesting read. I have since set up the index-time field extraction, after a fair amount of pain, and running the following search now takes 100-200 ms. What a difference!

search sourcetype=sourcetypeid KeyID::1

martin_mueller
SplunkTrust

Are you actually looking for a value "1" or is that just an example?

If you are, Splunk first loads all events containing "1" and then matches them against the field you are searching on. That's not very efficient, because I assume there are many events containing "1" where KeyID isn't "1".
