Splunk Search

Why is my map command returning an error when there are no results from the main search?

packland
Path Finder

Hi,

I'm having issues where the map command returns an error when there are no results from the main query. In my use case, this is a perfectly reasonable scenario.

If the initial search returns no results, the map command shouldn't need to execute at all; instead, it tries to run its subsearch without any token values.

index=myIndex value1!=True
| stats count by siteID
| map [search index=myIndex earliest=-2d value2!=True siteID=$siteID$
      | stats latest(_time) as lastContact by siteID, siteName, region, siteType]

So if the main search doesn't return anything, I get an error: Error in 'map': Did not find value for required attribute 'siteID'.

This wouldn't be too bad, but the results from this search are being appended to another search, and this error causes both searches to fail, regardless of whether the first search was successful or not.

Any pointers on getting this map command to play nice would be greatly appreciated.

Thanks


woodcock
Esteemed Legend

The fillnull approach is not the right way to do it, and it will not work in all versions. Here is an approach that works for all versions of Splunk. Essentially, you create a fake/placeholder event before calling map, ignore it inside the map, and then throw it away at the end. Let's say you are working on a field called sid and have other fields going into the map. You would instrument this solution like this:

... | rename COMMENT1of3 AS "Without the placeholder event, when there are no matching events,"
| rename COMMENT2of3 AS "the 'map' call will generate a 'field not defined' error."
| rename COMMENT3of3 AS "This placeholder event is dropped during/after the 'map' call."
| appendpipe [
    | stats count AS placeholder
    | where placeholder == 0
    | eval sid = "PLACEHOLDER"
]
| map maxsearches=9999 search="
    ... | <commands and stuff here> $sid$ ... | blah
"
| search NOT sid = "PLACEHOLDER" ...
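Applied to the search from the original question, the placeholder pattern might look like the following (a sketch using the thread's field names; the maxsearches cap is chosen arbitrarily, and when siteID is "PLACEHOLDER" the mapped search simply finds nothing, so the final filter is a safety net):

 index=myIndex value1!=True
 | stats count by siteID
 | appendpipe [
     | stats count AS placeholder
     | where placeholder == 0
     | eval siteID = "PLACEHOLDER"
 ]
 | map maxsearches=9999 search="search index=myIndex earliest=-2d value2!=True siteID=$siteID$
       | stats latest(_time) as lastContact by siteID, siteName, region, siteType"
 | search NOT siteID = "PLACEHOLDER"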

nick405060
Motivator

Good workaround, upvoted.

However, to any Splunk developers looking at this question and wondering whether something should be done about this: the answer is yes. Copying and pasting my usergroup comments for posterity:

Nick 3:01 PM
I'd buy the argument that it is not really a bug; however it can be pretty misleading. e.g. how should a user know that the reason for the error is no results being returned? they might just assume there are in fact results, but that there's a bug in the SPL (aka my users). which you can't blame them for, considering they get two error messages one of which says "the search job failed." lol (edited) 
Nick 3:08 PM
you should not have to have infinite knowledge of splunk in order to use splunk.

natzha
Explorer

Thanks. Solved my question.


morethanyell
Builder

This is how it should be done. Upvoted.


andygerberkp
Explorer

In a similar vein, if you are not using a stats command, you can simply append a makeresults to create a dummy result to feed to | map.

| append 
    [| makeresults 
    | eval siteID="DUMMY"]

elliotproebstel
Champion

Well, you can solve the current problem with a simple fillnull:

index=myIndex value1!=True
| stats count by siteID
| fillnull value="" siteID
| map [search index=myIndex earliest=-2d value2!=True siteID=$siteID$
      | stats latest(_time) as lastContact by siteID, siteName, region, siteType]

That will eliminate the errors for the search code as it is currently written.

However, I strongly suspect there is a better way to structure this code so that it's not using map at all, especially if (as the snippet suggests) your map search is iterating over the same indexed data as the primary search that feeds it. Without any other context, my intuition is that you're finding siteID values in the primary search over a different time window than the one you're using in your mapped subsearch.

I can totally relate to the desire to structure code like this, but it's not the efficient way to do things in Splunk. If you'd like help rewriting this so that your search is more efficient and less brittle, feel free to post more details (either here in response to my post, or in a new "How do I make this search more efficient?"-type post).

But, as an FYI, if you decide to stick with map: you'll want to add maxsearches=x, where x is the maximum number of subsearch iterations you want the map command to run. If you want to live dangerously and allow it to run for as many siteID values as the primary search finds, you can use maxsearches=0. If you don't specify this attribute, map maxes out at 10 iterations of the subsearch.
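For example, combining the fillnull fix with an explicit iteration cap might look like this (a sketch; the cap of 100 is arbitrary, so pick one that matches how many siteID values you expect):

index=myIndex value1!=True
| stats count by siteID
| fillnull value="" siteID
| map maxsearches=100 [search index=myIndex earliest=-2d value2!=True siteID=$siteID$
      | stats latest(_time) as lastContact by siteID, siteName, region, siteType]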

packland
Path Finder

Such a simple solution, thanks so much. And yes, you're right: I'm using the map command primarily because I don't want the main search to run over 2 days' worth of data, since in that time there are almost 1 million events, and it seemed faster to wait until the search had been narrowed down to the point where it was only searching for one or two sites rather than all of them. That said, the search does feel slow and inefficient, and I would like to improve it.

Basically, this search looks for devices that have gone offline, and then the map command searches a maximum of 2 days back to find out how long it has been since the device was online. So I only want recent events for the first part of the search and historical data for the second part, in order to create a downtime column in the final results. Any ideas?

Thanks again for your answer.


elliotproebstel
Champion

Glad to help! And here's what makes map a terribly inefficient way to do almost anything: when Splunk runs a search, it allocates a decent number of distinct resources on the search head, including a processor core, to that particular search. When you use map, the primary search produces some number of results to feed into the map subsearches, and then Splunk creates a whole new search for each of those results. So if the primary search returns four values for siteID, it's NOT as though Splunk is running:

index=myIndex earliest=-2d value2!=True siteID=value1 OR siteID=value2 OR siteID=value3 OR siteID=value4

Rather, Splunk is running four individual searches:

index=myIndex earliest=-2d value2!=True siteID=value1

and

index=myIndex earliest=-2d value2!=True siteID=value2

and

index=myIndex earliest=-2d value2!=True siteID=value3

and

index=myIndex earliest=-2d value2!=True siteID=value4

...with all the resource allocation that those searches imply.

So one way to rewrite the search is to reverse the order of the primary search and the subsearch. As your post is written, I know the intended timeframe for the original subsearch but not for the original primary search, so I'll structure this as though your original primary search was looking at the last hour. Adjust to your actual needs accordingly.

index=myIndex earliest=-2d value2!=True 
 [ search index=myIndex value1!=True earliest=-1h 
  | fields siteID 
  | format ]
| stats latest(_time) as lastContact by siteID, siteName, region, siteType

This approach will still search the smaller time window first, but (assuming the subsearch returns 4 values), it will expand out to:

index=myIndex earliest=-2d value2!=True siteID=value1 OR siteID=value2 OR siteID=value3 OR siteID=value4
| stats latest(_time) as lastContact by siteID, siteName, region, siteType

So that's already a big improvement. If the subsearch always completes quickly and returns a small number of values, that should work reasonably well.

While you're trimming, you might also try replacing those instances of value1!=True and value2!=True. It is almost always faster to run a Splunk search for something positive (e.g. value1=False) than for the negation of something. If the list of possible values for value1 and value2 is short, try running your search the way it's written, and then run it with value1=False OR value1=KindOfFalse OR value1=NotTotallyTrue (or whatever values value1 might take that preclude value1=True), and see if this speeds up the search. It doesn't always, but it often does.
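Putting both suggestions together, the rewritten search might look like this (a sketch; it assumes False is the only value that fails the value1=True and value2=True tests, which you'd need to verify against your own data):

index=myIndex earliest=-2d value2=False
 [ search index=myIndex value1=False earliest=-1h
  | fields siteID
  | format ]
| stats latest(_time) as lastContact by siteID, siteName, region, siteType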

packland
Path Finder

I had no idea you could expand searches like that. That is incredibly useful! I implemented your suggestions and the search runs much faster (also with less headache).

Thank you again for your help. I'll definitely be using this more in the future.

This actually also fixed another issue I was having where some sites wouldn't appear if they hadn't been online at some point during the last 2 days.

elliotproebstel
Champion

Really glad to help! If you have other searches that are taking longer (or taking more resources) than you think makes sense, feel free to ask on here. There are lots of us who enjoy helping/explaining and making searches run more efficiently!


HiroshiSatoh
Champion

Try this!

 index=myIndex value1!=True
 | stats count by siteID
 | map search="search index=myIndex earliest=-2d value2!=True siteID=\"$siteID$\"
       | stats latest(_time) as lastContact by siteID, siteName, region, siteType"

Note: use this form (with escaped quotes around the token) when siteID is a character string.


packland
Path Finder

Thanks for the response, although this doesn't seem to fix my issue. siteID will always be a 5-digit integer (or nothing, if there are no results in the main search).


HiroshiSatoh
Champion

I inserted a dummy event. When siteID is "*", all events are extracted (the wildcard matches every site), and the final where clause keeps the wildcard results only when the main search returned nothing.

 index=myIndex value1!=True
 | append [search | noop | stats count | eval siteID="*"]
 | stats count by siteID
 | map search="search index=myIndex earliest=-2d value2!=True siteID=$siteID$
       | stats latest(_time) as lastContact by siteID, siteName, region, siteType"
 | eventstats count as all
 | where (all=1 and siteID="*") OR (all>1 and siteID!="*")