Splunk Search

Merge fields from distinct sources on event

LordLeet
Path Finder

Hello,

I'm trying to find the best way to do the following:
Source A: id_field1 id_field2 fielda1 fielda2
Source B: id_field1 id_field2 fieldb1
Source C: id_field1 fieldc1

I want to retrieve fieldb1 and fieldc1 from source B and C respectively and still be able to search fields from source A.
Fieldb1 and fieldc1 can store distinct values for the same id_field.

Example:
_time id_field1 id_field2 fielda1 fielda2 fieldb1 fieldc1
xx-yy-zzzz 111 444 FA1 FA2 AAA CCC
xx-yy-zzzz 111 444 FA1 FA2 BBB CCC
xx-yy-zzzz 111 444 FA1 FA2 AAA ZZZ

I tried several approaches:
- Used eventstats to add both fieldb1 and fieldc1 to events, which is a slow method.
- Used the trick eval id__{source}=id_field1 | stats values(id__*) as id__* (...) by id_field1 | where id__sourcea=id__sourceb (...) so far this seems to be the best approach but by applying the values to fieldb1 and fieldc1 I'll sometimes have a multivalue field which will lead to use the mvexpand command and make the query even slower. Another issue is that applying the values() will sort the multivalue field alphabetically and I'll loose track of the connection between id_field2 and fieldb1.

Using an automatic lookup won't fit due to the huge number of distinct id's.
I'm also looking at the append function but due to its 50k events limitation I think it won't work.

Could you provide some guidance on how to do this?

Thanks in advance.

0 Karma

deepashri_123
Motivator

Hey LordLeet,

Are you trying something like this:

source=sourceA OR source=sourceB OR source=sourceC | stats values(fielda1) AS fielda1 values(fielda2) AS fielda2 values(fieldb1) AS fieldb1 values(fieldc1) AS fieldc1 values(id_field1) AS id_field1 values(id_field2) AS id_field1 by _time

Let me know if this helps!!

0 Karma

logloganathan
Motivator

Could you please try this query and let us know

source="Source A" | table id_field1 id_field2 fielda1 fielda2 | append [ search source="Source B" | table fieldb1 ] | append [ search source="Source C" | table fieldc1 ] | table id_field1 id_field2 fielda1 fielda2 fieldb1 fieldb1

0 Karma

LordLeet
Path Finder

Hello logloganathan,

Unfortunately append won't solve it since it truncates the results to 50k events and I have much more than that. Due to performance issues I think I shouldn't mess with this value on the configuration files.

0 Karma

logloganathan
Motivator

Could you please provide your full query so that i can modify and provide to you

0 Karma

LordLeet
Path Finder

Lets say source A is the main data source and source B and C only store some metadata.
Source A stores the user and the item_id, Source B stores the item_id and item_shape and Source C stores the item_id and the item_color.

I only need to access B and C to retrieve those fields and then do other operations or aggregations on the main source based on the item color and shape.

This is an example

index=idx (source=sourceB OR source=sourceC)
| chart dc(item_id) as items by user color
| rename 1 as Red 2 as Green 4 as Blue
| eval Purple=Red+Blue
| eval item_specs=case(Green!=0 AND Purple!=0,"Green & Purple",Green!=0 AND Purple=0,"Green Only",Purple>1 AND Green=0,"Purple Only",NULL=0 AND Purple=0 AND Green=0, "Uncolored")
| stats count by hh_type

Green & Purple 6537
Green Only 1755
Uncolored 419669
Purple Only 36577

This works as intended if I just wanted to do the aggregation.
I'm having trouble on how to proceed if I wanted to use more fields than the ones mentioned above since the chart wouldn't work.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...