I have a massively complex search that's working. But now I'd like to augment the output of that search with some additional fields, which can be found by using a secondary search. For this to be efficient, I need the output of the core search to be fed in as parameters of the secondary search. (Basically, I'm looking for a "lookup", but a lookup that's based on another search, not a CSV file, script, or kv-store.) I'm really only dealing with one or two results at a time, so the typical inefficiencies of launching multiple searches are not a concern here.
It seems like this should be possible with the appendpipe search command in combination with the map command. Instead of trying to make this work in the context of my already complex search, I broke it down into its simplest form.
This search works, demonstrating that map works as expected:
| stats count | eval series="splunkd" | map search="search index=_internal source=*metrics* group=per_sourcetype_thruput series=$series$ | head 1 | table series kb max_age"
The output is: series=splunkd, kb=10.49353, max_age=1
This also works, demonstrating that the field "series" makes it from the base search into the subsearch, just as appendpipe advertises:
| stats count | eval series="splunkd" | appendpipe [ eval new_field=series ]
The output looks like so:
However, once combined, something goes (silently) wrong:
| stats count | eval series="splunkd" | appendpipe [ map search="search index=_internal source=*metrics* group=per_sourcetype_thruput series=$series$ | head 1 | table kb series max_age" ]
The output looks like:
I was expecting the output to look like:
In real life, the first result would have lots of other useful fields, and I'd add something like | stats values(*) as * by series to group all the relevant fields into a single result.
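To make that merge step concrete, here is a minimal sketch that fakes both results with eval instead of running the real searches. The other_field name and its value are hypothetical placeholders for "other useful fields", and the kb and max_age values are simply copied from the map output above. The append here only stands in for whatever mechanism ends up supplying the second row.

```
| stats count
| eval series="splunkd"
| eval other_field="from_base_search"
| append [ | stats count | eval series="splunkd" | eval kb=10.49353 | eval max_age=1 ]
| fields - count
| stats values(*) as * by series
```

If this behaves as I expect, the two rows should collapse into a single result keyed on series that carries other_field, kb, and max_age together.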
Any thoughts? I've been testing this on Splunk 6.2
Update: Here's a slightly better example query and my current workaround.
index=_internal source=*metrics* group=per_sourcetype_thruput
| stats sum(ev) as ev by series
| sort 1 - ev
| appendpipe [ map search="search index=_internal source=*metrics* group=per_sourcetype_thruput series=$series$ | head 1 | table series kb" ]
| selfjoin series
The workaround doesn't use map, because without appendpipe there's no way to "pass in" fields. So instead we have to use a subsearch, which essentially requires repeating the entire base search. That works, but isn't very efficient. Here's the search:
index=_internal source=*metrics* group=per_sourcetype_thruput
| stats sum(ev) as ev by series
| sort 1 - ev
| append [
search index=_internal source=*metrics* group=per_sourcetype_thruput [
search index=_internal source=*metrics* group=per_sourcetype_thruput
| stats sum(ev) as ev by series
| sort 1 - ev
| return series ]
| head 1
| table series kb ]
| selfjoin series
Update 2: For whatever it's worth, the problem with map is not field substitution, because this does not work either:
| stats count | appendpipe [ map search="search index=_internal source=*metrics* group=per_sourcetype_thruput series=splunkd | head 1 | table series kb" ]
Weird.
Someone from Splunk might confirm this, but on my reading of the docs for appendpipe, the [ ] constructor is not a subsearch but a pipeline, meaning that all the field values are taken from the current result set and the [ ] cannot contain a subsearch. If you try to run a subsearch in appendpipe, i.e. | appendpipe [[]], you will get the following parser error:
Error in 'SearchParser': Subsearches are only valid as arguments to commands.
Again, the appendpipe [] syntax does not indicate a subsearch, but a different constructor. The map command in your appendpipe probably encounters this parser error, but it gets silently dropped. The result of that is, of course, NULL, so the result set passed back up to appendpipe is also NULL. That is why your final result is just count=0, series=splunkd, the same as if you had not used appendpipe at all. Interestingly, | appendpipe [] provides a different result. It takes the values count=0, series=splunkd, does nothing to them, then appends them to your original result, so the final result set is:
count=0, series=splunkd
count=0, series=splunkd
Ignore my earlier answer. It is incorrect (maybe someone can downvote it?). The answer is yes, you can use map inside appendpipe, but it seems to run only once, and I can't figure out how to pass values to it.
Here's a wacky run-anywhere example:
index=_audit | head 10 | stats count by host | appendpipe [ map search="search | head 5 | fields _raw host"]
That should run map 50 times, but it looks like it runs just once.
Maybe one of the Splunk developers can explain?
From what I read and suspect...
The "appendpipe" command looks to simply run a given command totally outside the realm of whatever other searches are going on...and append those results to the answerset.
Thus, in your example, the map command inside the appendpipe would be ignorant of the data in the other (preceding/outside) part of the search. As such, indeed, it would only run one time.
There are a LOT of people seeking ways to do some similar things (including me...as I want to do a sub-search based on data from elsewhere) and it's not easily intuitive to do so.... 🙂
updated for clarity
Hmm, it looks like a simple | append [[]] gives the same error, which I suspect is simply because it's nonsensical. In particular, there's no generating SPL command given. So a search like | appendpipe [ search [ search ] ] does "work", but doesn't do anything useful.
I agree that there's a subtle difference between the way that a subsearch works and the way the pipeline works. So in the example | appendpipe [ (#1) search [ (#2) search ] ], search #1 is a post-filtering operation, whereas search #2 is a generating command. But I think you still have to have some SPL command in #1, because a subsearch can only be used within certain commands, like search, append, join, and so on.
But I'm still not sure why this would limit the use of the map command from within the context of an appendpipe subsearch/subpipeline.
Derp, yep, you're right: [ [] ] does nothing anyway.
Here's a run-everywhere example of a subsearch running just fine in appendpipe:
index=_audit | head 1 | stats count | eval series="splunkd" | appendpipe [ search index=_audit [ search index=_internal | head 50 | fields host ] | stats count by host | rename host as series ]
So I am incorrect in saying that a subsearch won't function in an appendpipe. So ignore that bit.
Seriously, I was amazed that the appendpipe did not work, and I'm still puzzled. Meanwhile, I don't know how feasible it will be with your complex search, but something like this can generate the output that you seek:
| stats count | eval series="splunkd" | append [ | stats count | eval series="splunkd" | map search="search index=_internal source=*metrics* group=per_sourcetype_thruput series=$series$ | head 1 | table kb series max_age" ]
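The same pattern could in principle be applied to the top-series example from the question's update, by repeating the base search inside the append so that map receives series from real results rather than from a hard-coded eval. This is an untested sketch that only recombines the searches already posted in this thread. It assumes map behaves inside append the way it does in the search above, and it still pays the cost of running the base search twice.

```
index=_internal source=*metrics* group=per_sourcetype_thruput
| stats sum(ev) as ev by series
| sort 1 - ev
| append [
    search index=_internal source=*metrics* group=per_sourcetype_thruput
    | stats sum(ev) as ev by series
    | sort 1 - ev
    | map search="search index=_internal source=*metrics* group=per_sourcetype_thruput series=$series$ | head 1 | table series kb" ]
| selfjoin series
```

If the base search returns nothing, map should have no rows to iterate over, which might sidestep the "wide open" problem that a subsearch with return can run into. That is an expectation, not something verified here.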
So it's interesting to me that map works properly from an append but not from appendpipe. (This may lend itself to jplumsdaine22's note about subsearch vs. pipeline.)
And yeah, my current workaround is using a bunch of appends and subsearches to get what I need. The key difference here is that the value of "series" isn't known ahead of time; it's determined by an earlier search, so setting the value within the append subsearch isn't really an option for me.
So my search ends up looking like this:
<base search returning a,b> | append [ sourcetype=c [ <base search returning b> | return b ] | manipulate ... | table c ] | stats values(*) as * by b | table a b c
And of course this simplification ignores the fact that if the base search returns nothing, the parent search essentially goes wide open and returns too many results, leading to an awkward solution like so:
<base search returning a,b> | append [ sourcetype=c [ <base search returning b> | append [ stats count | eval b="NeverOccuringInRealLife" | fields b ] | return 2 b ] | manipulate ... | table c ] | stats values(*) as * by b
May need to switch over to handling all this logic externally and use some REST APIs; which is just a bit frustrating because SPL is so close to being able to handle this natively.
Did anyone ever explain this properly? Might be one for the slack channel
Also, I just saw the second comment at the bottom of this page, which says that map inside a subsearch doesn't support value substitution (from the base search):
http://docs.splunk.com/Documentation/Splunk/6.2.0/SearchReference/Map