Splunk Search

Join 2 large tstats data sets

btorresgil
Builder

I need to join two large tstats namespaces on multiple fields. For example, I have these two tstats:

| tstats count(dst_ip) AS cdip FROM bad_traffic groupby protocol dst_port dst_ip

and

| tstats count(dst_ip) AS cdipt FROM all_traffic groupby protocol dst_port dst_ip src_ip

I need all src_ip fields from all_traffic namespace where the dst_ip, dst_port, and protocol of the all_traffic entry match a dst_ip, dst_port, protocol combination in the bad_traffic namespace. Effectively this gives a list of all the source ip for traffic that matches bad traffic.

I read through the stats, tstats, and eval manuals, but I'm stuck on how to do this efficiently. I can do it with a join on the two tstats commands above, but the datasets are so large it takes forever. I considered doing a prestat and append on the tstats, but I can't seem to get the desired results this way.

Thanks for any help!

Tags (2)
1 Solution

dart
Splunk Employee
Splunk Employee

I'd do it like this:

| tstats count(dst_ip) AS cdip FROM bad_traffic groupby protocol dst_port dst_ip | tstats append=true count(dst_ip) AS cdipt FROM all_traffic groupby protocol dst_port dst_ip src_ip | stats sum(cdip) as cdip sum(cdipt) as cdipt values(src_ip) as src_ip by protocol dst_port dst_ip | search cdip>0 cdipt>0

does that give you what you want?

View solution in original post

dturnbull_splun
Splunk Employee
Splunk Employee

I'm not sure if my previous answer used to work or was simply incorrect. Tstats does work with union:

| tstats summariesonly=false values(Authentication.tag) as tag, values(Authentication.app) as app, count(eval('Authentication.action'=="failure")) as failure, count(eval('Authentication.action'=="success")) as success from datamodel=Authentication by Authentication.src |union [tstats count from datamodel=Authentication BY Authentication.src ]    

So to rework the example above:

| tstats count(dst_ip) AS cdip FROM bad_traffic groupby protocol dst_port dst_ip | union [tstats  count(dst_ip) AS cdipt FROM all_traffic groupby protocol dst_port dst_ip src_ip] | stats sum(cdip) as cdip sum(cdipt) as cdipt values(src_ip) as src_ip by protocol dst_port dst_ip | search cdip>0 cdipt>0

You might need to make the source datamodel names explicit and rename, but that should be straightforward, either before the union, after it or inside the subsearch.

0 Karma

dart
Splunk Employee
Splunk Employee

I'd do it like this:

| tstats count(dst_ip) AS cdip FROM bad_traffic groupby protocol dst_port dst_ip | tstats append=true count(dst_ip) AS cdipt FROM all_traffic groupby protocol dst_port dst_ip src_ip | stats sum(cdip) as cdip sum(cdipt) as cdipt values(src_ip) as src_ip by protocol dst_port dst_ip | search cdip>0 cdipt>0

does that give you what you want?

landen99
Motivator

I downvoted this post because doesn't work

0 Karma

Sanjai676
Path Finder

Perfect..!! worked like a charm. good logic.

0 Karma

David
Splunk Employee
Splunk Employee

I'm not able to replicate this.. You can't do append=true without prestats, and it doesn't seem like you can sum() a count() prestats clause (e.g., sum(cdip) where cdip = prestats count(dst_ip)). Does this work?

0 Karma

David
Splunk Employee
Splunk Employee

@woodcock because some hero has yet to propose a better answer, I can only assume 😉

woodcock
Esteemed Legend

Sadly, I am not he, but I will followup here if I find him!

0 Karma

David
Splunk Employee
Splunk Employee

Duncan is though! Scroll down but a little and you’ll see the new and improved answer.

0 Karma

woodcock
Esteemed Legend

Why is an answer that cannot even run accepted?

0 Karma

sanjay_shrestha
Contributor

How did you solve this issue?

0 Karma
Get Updates on the Splunk Community!

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...