Searches that use the "transaction" command, for example, can be really slow. What would be the best approach for speeding up searches in Splunk, in terms of both hardware and search optimization?
Note: Please do not advise me to use subsearches in any way; I gave up on them long ago!
As Kristian demonstrated, sometimes the best way to make transaction operations faster is not to use transaction at all. If your transaction is simple enough, you can express it using stats with a by clause. This is much faster.
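As a rough sketch of the stats approach, assuming your events share a session field (JSESSIONID here, taken from the access log example in this thread; yours will likely differ), something like this gives you the duration and event count that transaction would otherwise compute:

```
sourcetype=access_combined
| stats min(_time) AS start max(_time) AS end count AS eventcount by JSESSIONID
| eval duration = end - start
```

This only works when a single field (or a fixed set of fields) identifies the transaction. If you need boundaries based on event content, you are probably back to transaction.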
If your transaction is too complex to manage with stats, then you should go out of your way to minimize the number of events dealt with. For example, if you're looking for 3 or 4 events out of the 20 that make up a transaction, search only for those 3 or 4, like:
sourcetype=foo ( event 1 stuff ) OR ( event 2 stuff ) OR ( event 3 stuff ) | transaction
Also, limit the size/scope of your transactions by using startswith / endswith so that transaction can "complete" them as soon as possible.
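For illustration only, here is roughly what that can look like. The field name session_id and the terms "login" and "logout" are made up for this sketch, and the 30 minute maxspan is an arbitrary cap, so adjust all of them to your data:

```
sourcetype=foo ("login" OR "logout")
| transaction session_id startswith="login" endswith="logout" maxspan=30m
```

With both boundaries defined, transaction can close a session as soon as it has seen the matching pair, and maxspan keeps it from holding on to sessions that never finish.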
In general, try to limit the time span first. And be as specific as possible before the first pipe. The more you can limit the search before the first pipe, the better.
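A small sketch of what "specific before the first pipe" can look like. The 24 hour window is just an assumed example, and the host value is borrowed from the searches in this thread:

```
sourcetype=access_combined host=www1.company.com earliest=-24h@h latest=now
| stats count by clientip
```

The time range and the host filter are applied when the events are retrieved, so everything after the pipe has far less data to work through.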
The following searches count unique visits (not visitors) and total bytes per client IP, and should result in a table like:
clientip      visits   sum(bytes)
10.11.12.13   42       89231
11.12.13.14   32       12212
12.13.14.15   11       12341
BAD:
sourcetype=access_combined | transaction JSESSIONID | search host=www1.company.com | stats count AS visits sum(bytes) by clientip
BETTER:
sourcetype=access_combined host=www1.company.com | transaction JSESSIONID | stats count AS visits sum(bytes) by clientip
BEST:
sourcetype=access_combined host=www1.company.com | stats distinct_count(JSESSIONID) AS visits sum(bytes) by clientip
Hope this helps,
Kristian