Splunk Search

join with subsearch that has different field name, and do so efficiently

myudkowsky
Communicator

I would like to join search results with subsearch results, but I need to rename or define a new field name in order to tie one search to the other properly. Unfortunately, I can't seem to get the subsearch to use that new variable name.

First, the main search:

foo | eval join_id=parentsessionid

This finds all the "foo" events, along with the parent's session ID. I copy parentsessionid into a new field, "join_id", because I want to use it to join with results from the parent session. Note that both "foo" and "bar" events have sessionid and parentsessionid fields, so I have to be careful about which field I use on each side.

Now I want to join with a subsearch:

 | join join_id [ search bar | eval join_id=sessionid ]

This would seem, in theory, to join the two together -- the "bar" information from the parent session with the "foo" information from the child.

In all, the search looks like this:

foo | eval join_id=parentsessionid
 | join join_id [ search bar | eval join_id=sessionid ]

Unfortunately, this does not work. The output is simply the result of a plain "foo" search, as though the "bar" search never happened.

For those who prefer a real example,

2c0657b033a076d7df0e2b7d8d4288c7 (call_start OR connectionid) 
 | eval join_id=parentsessionid
 | join join_id [ search 34ec4840b397715e47d33304ba1b9be0 (session event connection.connected) 
 | eval join_id=sessionid ]

I also realize that this seems hideously inefficient. I'm searching over a very large number of "bar" entries and then discarding almost all of them. I wouldn't mind a tip or two on how to make the search more efficient. But at present I'm more concerned about getting it to work in the first place.


Ayn
Legend

Generally it is wise to avoid join if possible. It's very expensive resource-wise and there's often (though of course not always) a smoother solution that's more suited for Splunk instead of being more suited for SQL. If you can find a set of eval statements that will create a join_id that comes from the parent session ID in the cases where you want that, and the current session ID where you want that, you could use transaction instead. It's admittedly somewhat resource consuming as well, but it's smoother and often makes more sense to use.

... | eval join_id=if(someconditionforparentsessionid, parentsessionid, sessionid) | transaction join_id
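
As a sketch of how that condition might be filled in: assuming "foo" and "bar" are the placeholder search terms from the question, and that any event matching "foo" is a child event, searchmatch() can decide which field feeds join_id. Both sets of events come back from a single search, so no subsearch is involved:

(foo) OR (bar)
 | eval join_id=if(searchmatch("foo"), parentsessionid, sessionid)
 | transaction join_id

An event that matches both terms would be treated as "foo" here, so the condition may need tightening for real data. If transaction turns out to be too heavy, replacing the last line with "| stats values(*) as * by join_id" is a commonly used, lighter-weight way to group on the same field.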

myudkowsky
Communicator

Still not able to come up with a join; I tried subsearch, and that's not producing the results I expect either.


myudkowsky
Communicator

Ayn,

Thanks for the suggestion. Unfortunately, I have been unable so far to come up with a transaction for this, which was my first choice. AFAICT I need the specific session ID info from the "foo" search first.

I'll try again, however, because your answer just gave me an idea for a new "if" that I haven't tried yet.


myudkowsky
Communicator

I should mention that the output of each individual search is, in fact, correct.

That is,

foo | eval join_id=parentsessionid | stats values(join_id)

produces the expected result.
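
The same check can be run for the other side, again with the placeholder terms from above:

bar | eval join_id=sessionid | stats values(join_id)

If the values from the "foo" search do not show up in this list, join has nothing to match on. One thing that may be worth ruling out, given how many "bar" events there are: the subsearch inside join is subject to result-count and runtime limits, so it might be cut off before it reaches the matching events.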
