Depth of nested join and sub search

gregb
Explorer

I have an odd problem with nested joins on Splunk 4.3.2. I am trying to put together a report on latency across all components in an event-based architecture.

An event enters ComponentA -> ComponentB -> ComponentC -> ComponentD -> ... -> ComponentZ -> generate notification event (all with different correlationIds across each hand off).

I am running into a situation where the third nested join/subsearch fails in the complex query, while exactly the same query (minus the innermost nested "[search ... ]"), which reports on ComponentB -> ComponentC, works as expected.

Anyone have any idea?

The initial chain:
index=myindex eventtype="compA_start" compA_entry_name=bookTrade
| dedup request_id
| join request_id [search index=myindex eventtype=compA_end
| eval end_time=_time ]
| fields end_time, request_id
| eval compA_total_time = end_time-_time
| join request_id [search index=myindex eventtype=compB
| rename req_id as request_id
| dedup request_id
| eval compB_start_time = _time - all
| rename trannum_e as compC_tran_num
| rename all as compB_total_time
| rename source as compB_source
| join outer compC_tran_num [search index=myindex sourcetype=compC
| eval compC_time = _time
| rename compC_sequence_number as evtId ]]

This works in isolation:

index=myindex sourcetype=compC
| eval compC_time = _time
| rename compC_sequence_number as evtId
| join outer evtId [search index=myindex eventtype=compD_notification
| eval compD_key_time = _time ]

The full chain, with the third nested join added, fails:

index=myindex eventtype="compA_start" compA_entry_name=bookTrade
| dedup request_id
| join request_id [search index=myindex eventtype=compA_end
| eval end_time=_time ]
| fields end_time, request_id
| eval compA_total_time = end_time-_time
| join request_id [search index=myindex eventtype=compB
| rename req_id as request_id
| dedup request_id
| eval compB_start_time = _time - all
| rename trannum_e as compC_tran_num
| rename all as compB_total_time
| rename source as compB_source
| join outer compC_tran_num [search index=myindex sourcetype=compC
| eval compC_time = _time
| rename compC_sequence_number as evtId
| join outer evtId [search index=myindex eventtype=compD_notification
| eval compD_key_time = _time ]
]]

I have also verified that evtId is populated in the results of each search.


cphair
Builder

You need to use join type=outer, not join outer. My guess is that the second form looks for a field called outer, doesn't find it, and falls back to an inner join on compC_tran_num. I just tried both syntax versions on my own data, and they do return different results.
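
As an illustration, here is the standalone ComponentC -> ComponentD search from the question with only the join syntax changed to type=outer. This is a sketch based on the field and eventtype names in the original post, not a tested fix for the full nested query; the same change would apply to the "join outer compC_tran_num" step.

```spl
index=myindex sourcetype=compC
| eval compC_time = _time
| rename compC_sequence_number as evtId
| join type=outer evtId
    [ search index=myindex eventtype=compD_notification
      | eval compD_key_time = _time ]
```

With type=outer, compC events that have no matching compD_notification event should still appear in the results, with compD_key_time left empty.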

That said, three nested subsearches is going to be pretty slow. Have you looked into using variants on "transaction request_id"?
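
A minimal sketch of that idea for the ComponentA leg only, assuming the compA_start and compA_end events carry the same request_id (as the join in the question implies). The transaction command adds a duration field holding the time between the first and last event in each group, which stands in for compA_total_time here. Filtering on compA_entry_name=bookTrade would still need to be added, since that field is only known to exist on the start events.

```spl
index=myindex (eventtype="compA_start" OR eventtype="compA_end")
| transaction request_id
| eval compA_total_time = duration
| fields request_id, compA_total_time
```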

gregb
Explorer

Let me try type=outer, but I think I already tried with just "join". The problem with transaction is that it requires a unifying correlation ID across all the events, and there isn't one. The chain needs to be stitched together from one log file to the next via differing correlation IDs.