Hello Splunkers.
I have the search and subsearch below, which both work fine by themselves, but when I try to join them to create a 'master' list, I suddenly lose events. The main search returns approximately 9.3K events/hostnames and the subsearch returns approximately 11.8K hostnames. I expected the join to return a number of hostnames somewhere in the middle, but I only get a little over 4K in return.
What am I doing wrong?
index=asset_db source="/var/asset_database/fullpull.csv" "System Name"=* NOT "Purpose2"=Farm
| convert timeformat="%m/%d/%Y" mktime("Last Audit") as last_audit_time
| eval timer=now()-(90*24*60*60)
| where last_audit_time>timer
| rename "OS Name" as OS
| rename "System Name" AS hostname
| eval hostname=lower(hostname)
| join hostname
    [search index=assets source="/scratch/cadence_assets/AD-host-report.CSV" earliest=-90d@d latest=-0d@d Name=* "Operating System"=*
    | rename "Operating System" AS OS
    | rename Name AS hostname
    | eval hostname=lower(hostname)
    | fields hostname,OS]
Thanks!
The maximum number of results from a plain subsearch is 10000 by default.
The maximum for a subsearch used by join is 50000 by default.
See http://docs.splunk.com/Documentation/Splunk/6.1.4/Admin/Limitsconf
[subsearch]
* This stanza controls subsearch results.
* NOTE: This stanza DOES NOT control subsearch results when a subsearch is called by
commands such as join, append, or appendcols.
* Read more about subsearches in the online documentation:
http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsubsearches
maxout = <integer>
* Maximum number of results to return from a subsearch.
* This value cannot be greater than or equal to 10500.
* Defaults to 10000.
[join]
subsearch_maxout = <integer>
* Maximum result rows in output from subsearch to join against.
* Defaults to 50000
subsearch_maxtime = <integer>
* Maximum search time (in seconds) before auto-finalization of subsearch.
* Defaults to 60
subsearch_timeout = <integer>
* Maximum time to wait for subsearch to fully finish (in seconds).
* Defaults to 120
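With roughly 11.8K subsearch rows, the row limits above do not look like they are being hit, so the drop may come from join semantics rather than truncation: join defaults to type=inner, which keeps only hostnames that appear in both sources. A minimal sketch, reusing the searches from the question, that switches to a left join so unmatched hosts stay visible. The ad_os, in_ad, and hosts field names are made up for illustration, and this has not been run against your data:

```
index=asset_db source="/var/asset_database/fullpull.csv" "System Name"=* NOT "Purpose2"=Farm
| convert timeformat="%m/%d/%Y" mktime("Last Audit") as last_audit_time
| where last_audit_time>now()-(90*24*60*60)
| rename "System Name" AS hostname, "OS Name" AS OS
| eval hostname=lower(hostname)
| join type=left hostname
    [search index=assets source="/scratch/cadence_assets/AD-host-report.CSV" earliest=-90d@d latest=-0d@d Name=* "Operating System"=*
    | rename Name AS hostname, "Operating System" AS ad_os
    | eval hostname=lower(hostname)
    | fields hostname, ad_os]
| eval in_ad=if(isnull(ad_os), "no", "yes")
| stats dc(hostname) AS hosts by in_ad
```

If the "yes" count comes out near the 4K you are seeing, the inner join is simply reporting the overlap between the two lists. If the subsearch is slow, the subsearch_maxtime default of 60 seconds is also worth checking.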
Odd then that I maxed out at less than 5K. This wouldn't be the first time that the way I've crafted queries has introduced me to a maximum search result limit.
Any suggestions on how I should get those two searches joined? I'm trying to use Splunk to paint a picture of our asset inventory: if I can join the two asset logs by hostname and OS, I can understand what we have in our environment and later search against that main query, using it as a living master repository list. For example, I could use it as a sub/main search against a virus scan log to see which machines have a virus scan utility installed.
Thanks for any assistance.
Maybe this gives you a hint on how this could be done: http://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-joi...
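A rough sketch of that join-free idea, applied to the two sources from the question: search both at once, normalize the hostname, and let stats merge the rows. It assumes the time range picker is set to the last 90 days and leaves out the "Last Audit" filter for brevity. The seen_in and source_count field names are made up for illustration:

```
(index=asset_db source="/var/asset_database/fullpull.csv" "System Name"=* NOT "Purpose2"=Farm) OR (index=assets source="/scratch/cadence_assets/AD-host-report.CSV" Name=* "Operating System"=*)
| eval hostname=lower(coalesce('System Name', Name))
| eval OS=coalesce('OS Name', 'Operating System')
| stats values(OS) AS OS, values(index) AS seen_in, dc(index) AS source_count by hostname
```

This should return one row per hostname found in either source, with no subsearch limits involved. Appending | where source_count=2 would narrow it to hosts present in both lists, which is what the inner join gives you today.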