I have been using a complex search query (it's difficult for me to post it here without exposing internal information) that performs a join on a subsearch. The subsearch looks at a lot of data and can therefore take some time to run. I regularly get "auto-finalized after time limit reached" errors on the subsearch part of the query. I've looked into fixing this by adjusting limits.conf or using a lookup table instead, but would prefer a solution that doesn't require either of those.
I thought I could replace join type=outer joinOn [search my sub search string] with join type=outer joinOn [savedsearch subsearch1] and then move what was previously "my sub search string" into a saved search named "subsearch1". This does work, but I still get the auto-finalized error. I tried scheduling the "subsearch1" saved search to run every 15 minutes and cache its results for an hour, but when I run the main outer search, it still seems to run the join subsearch live (instead of using the cached results from the last time the subsearch1 saved search ran), as I still get the auto-finalized error.
Is there no way to have the join operation performed against the most recent results from the scheduled saved search?
The solution seems to be to use | join type=outer joinOn [loadjob savedsearch="myuserid:search:subsearch1"] |. However, to get this to work I had to do two things.
The original question/problem I had has been solved by this, but if anyone knows why 1 and 2 above apply, I would be interested in feedback.
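For anyone who lands here later, here is a minimal sketch of the whole pattern. It assumes the saved search is named subsearch1, lives in the search app, is owned by myuserid, and is scheduled so that a completed job already exists. The base search (index=main sourcetype=my_events) is a made-up placeholder, and joinOn stands in for the real join field. The leading pipe inside the brackets is the form the Splunk documentation uses for generating commands such as loadjob; the version without it is what I typed above.

```spl
index=main sourcetype=my_events
| join type=outer joinOn
    [| loadjob savedsearch="myuserid:search:subsearch1"]
```

As far as I understand it, savedsearch dispatches the saved search again each time the outer search runs, whereas loadjob reads the stored results of a previously completed job. That would explain why only the loadjob version stopped hitting the time limit, though I have not confirmed this beyond my own case.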
This was very helpful for my situation. Thank you!
Yes. The subsearch is basically querying the index filled by the Splunk Active Directory monitor app. As such, creating a summary index would result in roughly the same number of events (though potentially a bit less data per event) in that summary index.
And at any rate, I would really like to know if it is possible to run a join on the cached results from a scheduled search, as this seems like it could be beneficial in many other scenarios too.
Have you ruled out using summary indexes?
/k