Using the append command, it seems I can set maxout to a number less than 50000, but I can't increase it to anything higher?
e.g. with maxout=50001 I still get only 50000 results
Is this observation correct? If so is there any way to override with a larger number?
Correct, that is the maximum limit. But if you share your data and your goal, it is highly likely that we can come up with a solution that does not use subsearches.
Thanks - I see the entry in limits.conf which can be increased but I'm not anxious to do that as you can imagine.
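For reference, I believe this is the relevant entry in limits.conf (defaults shown; raising it affects every search on the system, hence my hesitation):

[searchresults]
maxresultrows = 50000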
Here's a summary of the use case below. I'm exploring several different approaches: subsearches, lookups using outputlookup (rough sketch below), intermediate .csv files using outputcsv, transactions, joins... It seems like a simple scenario and is actually quite malleable, but several of the approaches run out of steam on scalability as the number of events gets large, and others on performance.
Scenario: Splunk query to determine whether a new transaction which is performed by a company in the past hour has any historical record.
A transaction is deemed to have historical record if there is a similar transaction performed by the same company in past 90 days having the **same beneficiary name OR beneficiary account number**
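For context, the outputlookup variant I've been experimenting with looks roughly like this; the index and lookup file names are placeholders, and the two lookup calls are there to get the OR semantics (this assumes the .csv can be referenced directly, otherwise a lookup definition is needed):

index=transactions earliest=-90d@d latest=-60m | stats count by beneficiaryName beneficiaryAccountNumber | outputlookup beneficiary_history.csv

index=transactions earliest=-60m | lookup beneficiary_history.csv beneficiaryName OUTPUT count AS nameMatch | lookup beneficiary_history.csv beneficiaryAccountNumber OUTPUT count AS acctMatch | eval RecordHistory=if(isnotnull(nameMatch) OR isnotnull(acctMatch), "YES", "NO")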
Given your description, this should work (when run over "last hour"):
... | dedup beneficiaryName beneficiaryAccountNumber | map search="search earliest=-90d@d latest=-60m beneficiaryName=$beneficiaryName$ OR beneficiaryAccountNumber=$beneficiaryAccountNumber$"
Nice, thank you. Beautifully simple and elegant. My only concern is scalability/performance, as the map search runs once per input row and that can be a very large number. Something to test 🙂
It will scale without breaking (which was your problem), but it is slow and will get slower; that's a fair tradeoff for not breaking down completely.
Interestingly, there seems to be some limit on the number of results, but I can't find any entry in limits.conf to explain it.
In my lab environment it behaves fine until I exceed 60000 results; beyond that it returns only the first 50000. Up to that point performance seems linear, with execution time increasing with the number of results as expected.
You can add maxsearches=9999 or similar to the map command; otherwise it defaults to 10, I think.
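For example, the earlier search with the limit raised (9999 is arbitrary; set it above the number of rows dedup can emit):

... | dedup beneficiaryName beneficiaryAccountNumber | map maxsearches=9999 search="search earliest=-90d@d latest=-60m beneficiaryName=$beneficiaryName$ OR beneficiaryAccountNumber=$beneficiaryAccountNumber$"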
OK, I have another way; try this (run for last 90 days):
... | addinfo | eval type = if((info_search_time - _time) <= 3600, "LAST_HOUR", "LAST_90_DAYS") | fillnull beneficiaryName beneficiaryAccountNumber | stats values(type) AS Types dc(type) AS NumTypes by beneficiaryName beneficiaryAccountNumber | eval RecordHistory=if((NumTypes==2), "YES", "NO")
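To unpack that: addinfo adds info_search_time (the time the search was run) to each event, so the eval tags an event LAST_HOUR if it falls within 3600 seconds of that, otherwise LAST_90_DAYS. Any beneficiaryName/beneficiaryAccountNumber pair that ends up with both tags (NumTypes==2) has at least one transaction older than an hour, so it gets RecordHistory="YES".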