Splunk Search

Is there an efficient way to search for entries containing a specific substring of an indexed word?

johnmccash
Explorer

I'm not entirely certain how search optimization in Splunk works. If I search only for a rare indexed word, all entries that contain that word are found quickly. But what if I want to search for a substring of a rare indexed word, where the substring is itself rare? Say, for the sake of argument, that this rare substring occurs in only one indexed word. I can search for the substring bracketed by asterisks, but that seems to take significantly longer than searching for the rare indexed word that contains it.

Is there an efficient way to do a search like this directly? Failing that, is there a way to list all indexed words that contain a common substring? If I had that list for a given substring, I could simply search for all instances of the indexed words that contain the substring.

Thanks

lguinn2
Legend

Whenever a search term begins with a wildcard, the search will be particularly slow. Using wildcards forces Splunk to serially scan the lexicon to find any matching keywords for each bucket. Search terms that end with a wildcard are not as slow as search terms that begin with a wildcard.
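To illustrate why the position of the wildcard matters, here is a minimal Python sketch. It assumes a toy lexicon stored as a sorted list of terms; this is only a model for the reasoning above, not Splunk's actual on-disk format, and the names `LEXICON`, `exact`, `trailing_wildcard`, and `leading_wildcard` are made up for the example.

```python
from bisect import bisect_left

# Toy stand-in for one bucket's lexicon: a sorted list of indexed terms.
# (Assumption: the real lexicon format is not described in this thread.)
LEXICON = sorted(["abcdefghijklmnopq", "error", "login", "logout", "warning", "zebra"])

def exact(term):
    """Exact term: binary search, roughly log2(n) comparisons."""
    i = bisect_left(LEXICON, term)
    return [LEXICON[i]] if i < len(LEXICON) and LEXICON[i] == term else []

def trailing_wildcard(prefix):
    """prefix*: binary search to the first candidate, then read a contiguous run."""
    i = bisect_left(LEXICON, prefix)
    matches = []
    while i < len(LEXICON) and LEXICON[i].startswith(prefix):
        matches.append(LEXICON[i])
        i += 1
    return matches

def leading_wildcard(substring):
    """*substring*: sort order gives no starting point, so every term is examined."""
    return [term for term in LEXICON if substring in term]

print(exact("login"))                       # ['login']
print(trailing_wildcard("log"))             # ['login', 'logout']
print(leading_wildcard("bcdefghijklmnop"))  # ['abcdefghijklmnopq']
```

In this model, the exact and trailing-wildcard lookups touch only a handful of terms, while the leading-wildcard lookup touches all of them, and that work is repeated for each bucket.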

If Splunk knows the exact search term, it can use the index to find it directly. It can also use bloom filters to eliminate many buckets from the search. Bloom filters do not work with wildcards.
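As a rough illustration of how a bloom filter lets a search skip buckets, here is a small Python sketch. The `BloomFilter` class, its sizes, the hash scheme, and the bucket names are all assumptions chosen for the example; they do not reflect Splunk's real implementation.

```python
import hashlib

class BloomFilter:
    """Minimal bloom filter: can answer 'definitely absent' or 'possibly present'."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term):
        # Derive num_hashes bit positions from the whole term.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{term}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, term):
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term):
        return all((self.bits >> pos) & 1 for pos in self._positions(term))

# Hypothetical buckets, each with the terms it indexed.
BUCKETS = {
    "bucket_a": ["error", "login", "logout"],
    "bucket_b": ["abcdefghijklmnopq", "warning"],
    "bucket_c": ["zebra", "timeout"],
}

# One filter per bucket, built from that bucket's terms.
FILTERS = {}
for name, terms in BUCKETS.items():
    bloom = BloomFilter()
    for term in terms:
        bloom.add(term)
    FILTERS[name] = bloom

def candidate_buckets(term):
    """Buckets that cannot be ruled out for an exact term."""
    return [name for name, bloom in FILTERS.items() if bloom.might_contain(term)]

print(candidate_buckets("abcdefghijklmnopq"))
```

The filter hashes the complete term, so it never gives a false "absent" for a term that was added, but it has nothing to hash for a pattern such as `*bcdefghijklmnop*`. With a wildcard, every bucket remains a candidate.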

There is no way to use a wildcard while avoiding the performance penalty of using a wildcard.

However, if you can narrow the search by including additional terms, that will help. For example, be sure to specify the index and the sourcetype. Also, use as narrow a time range as possible for your search. Anything that helps Splunk reduce the number of buckets it has to scan will speed up the search.
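The effect of narrowing by index and time range can be sketched in Python as follows. This assumes a simplified model in which each bucket belongs to one index and covers one span of event times; the `Bucket` class, the field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bucket:
    index: str
    earliest: int  # epoch seconds of the oldest event in the bucket
    latest: int    # epoch seconds of the newest event in the bucket

def buckets_to_scan(buckets, index, start, end):
    """Keep only buckets in the requested index whose time span overlaps [start, end]."""
    return [
        b for b in buckets
        if b.index == index and b.earliest <= end and b.latest >= start
    ]

BUCKETS = [
    Bucket("web", 0, 999),
    Bucket("web", 1000, 1999),
    Bucket("web", 2000, 2999),
    Bucket("mail", 1000, 1999),
]

# A narrow time range in one index leaves a single bucket to scan.
print(len(buckets_to_scan(BUCKETS, "web", 1200, 1500)))  # 1
# A wide time range in the same index leaves three.
print(len(buckets_to_scan(BUCKETS, "web", 0, 2999)))     # 3
```

The wildcard scan still costs the same per bucket; the saving comes from running it against fewer buckets.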

johnmccash
Explorer

So... say I'm searching for the string "bcdefghijklmnop", which occurs exactly once in my entire (large) dataset. The one time it occurs, it appears as " abcdefghijklmnopq " (but, of course, I don't know what the leading and trailing characters are). Are you saying that the only way to find this instance is to search for

"*bcdefghijklmnop*"

?

Thanks
