Splunk Search

How to Manipulate Multiple Lookup Results

lguinn2
Legend

I have a lookup that returns multiple matches. Here is a simple example:

... | lookup emp-lookup dept OUTPUT employeeId employeeName
| sort dept employeeId employeeName
| table dept employeeId employeeName

This gives output that looks like this:

dept         employeeId           employeeName
HR           0002                 Pat
             0100                 Lisa
             0003                 Renee
Sales        0011                 Hon
             0008                 Ronnie

Problem 1 - The sort doesn't really work, because employeeId and employeeName are actually multi-valued fields. It only sorts the dept field. How can I make the list sort properly?

Problem 2 - If one of the departments has a really large number of employees, like 97, only the first 50 or so are shown, and then the last thing in the list says [and 47 more] or something similar. How can I show all the employees?

Since the lookup results are stored as multi-valued fields, I thought - mvexpand! But if I do that, the relationship between employeeId and employeeName is broken. For example:

... | lookup emp-lookup dept OUTPUT employeeId employeeName
| mvexpand employeeId
| table dept employeeId employeeName

Gives:

 dept         employeeId           employeeName
 HR           0002                 Pat
                                   Lisa
                                   Renee
 HR           0100                 Pat
                                   Lisa
                                   Renee
 HR           0003                 Pat
                                   Lisa
                                   Renee
 etc...

What a mess! I am out of ideas...

Tags (1)
1 Solution

MHibbin
Influencer

Can you not split it into two lookups (haven't tried this, just a bit of lateral thinking)...

So do a first lookup for the employee IDs only:

... | lookup emp-lookup dept OUTPUT employeeId

Then mvexpand that:

...|mvexpand employeeId

And then lookup the employeeName based on the employeeID

...| lookup emp-lookup employeeId OUTPUT employeeName

And then I think the fields should be properly populated to allow you to sort as you desire.

Just an idea, as I say I haven't tested it myself.

View solution in original post

DalJeanis
Legend

Updated - This way is simpler...

| lookup emp-lookup dept OUTPUT employeeId employeeName
| eval myFan=mvrange(0,mvcount(employeeId))
| mvexpand myFan
| eval employeeId=mvindex(employeeId,myFan)
| eval employeeName=mvindex(employeeName,myFan)

MHibbin's answer is perfect for the specific question, and this is an old question, but here's another approach that is useful in some circumstances, such as where the data to be stitched together did not come from a lookup, or the lookup was too expensive to be repeated multiple times on each record -

| lookup emp-lookup dept OUTPUT employeeId employeeName
| eval IDAndName = mvzip(employeeId,employeeName," !!!! ")
| fields dept IDAndName
| mvexpand IDAndName
| rex field=IDAndName "^(?<employeeId>[^!]+) !!!! (?<employeeName>.+)$"

I use multiple exclamation points as the delimiter because they never appear that way in my data, names and addresses and such, whereas commas and semis often do.

0 Karma

Drainy
Champion

Just to throw an alternative out there as this is something I hit today. I actually went about this a different way as I generally try to avoid using lookups in the search bar where I can.

My case was nearly identical in that I had a few single value fields merged with multi values from a lookup.

The CSV I work from is reloaded every day at midnight, at 1am I run a scheduled search to play with the results, What I do is create a new field which is the result of the three fields I am interested in appended together. E.g.

Field1=hello Field2=bye NewField=hello-bye

I then use this value in my results and all I have to do is an mvexpand and then a rex or eval to split them apart again, the performance hit is pretty much nowt and I'm only hitting a lookup once so my automatic lookup stays intact if anyone tries to run an ad-hoc search.

lguinn2
Legend

I like this idea, too. I was also wondering whether it would be worthwhile to write a custom command to "normalize" and "denormalize" events - like mvexpand and mvcombine, but treating the mult-valued fields as an embedded table instead of independently.

Thanks!

0 Karma

MHibbin
Influencer

Can you not split it into two lookups (haven't tried this, just a bit of lateral thinking)...

So do a first lookup for the employee IDs only:

... | lookup emp-lookup dept OUTPUT employeeId

Then mvexpand that:

...|mvexpand employeeId

And then lookup the employeeName based on the employeeID

...| lookup emp-lookup employeeId OUTPUT employeeName

And then I think the fields should be properly populated to allow you to sort as you desire.

Just an idea, as I say I haven't tested it myself.

MHibbin
Influencer

🙂 Ha! Hate when that happens, and you focus on one thing!

Happy to help! 🙂

0 Karma

lguinn2
Legend

Doh! I was so enmeshed in thinking about multi-value fields that I didn't think it through! Of course, AFTER the second lookup, every occurrence of an employee name will be in a separate event - and therefore perfectly sortable.

BTW, I implemented this solution. Thanks!!

0 Karma

MHibbin
Influencer

well, couldn't you just do the sorting after, and then you should be able to sort as per your requirements, couldn't you?

0 Karma

lguinn2
Legend

Good idea! If I could mvexpand, it would solve the "[and 47 others]" problem. And it would work even for the sorting - as long as I want to sort by the unique id (employeeId).

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...