Splunk Search

Newbie: Splunk equivalent of NATURAL JOIN

travispowell
Path Finder

I don't really know what to search for on here, but I can't seem to get the | (pipe operator) to work like UNIX.

I have two sets of records. Examples of each type:

A) April 18, session=12345 http_status=404
   April 19, session=78912 http_status=200
   April 20, session=45678 http_status=503

and...

B) April 18, session=12345 ipaddr=89.76.45.34
   April 19, session=78912 ipaddr=44.27.53.78
   April 20, session=45678 ipaddr=79.73.25.39

Here's what I'm trying to get my query to do: get all session ids where http_status >= 400, and then give me the result set of logs where session = x, and ipaddr exists.

Thank you! One example will open up a lot of doors for me with Splunk.

UPDATED: In SQL: SELECT * FROM "B" NATURAL JOIN "A"

Tags (3)
1 Solution

sideview
SplunkTrust
SplunkTrust

Let's say that the events in A have sourcetype="A", and the events in B have sourcetype="B". Maybe in your actual data it'll be source or host or something else but it doesnt matter.

1) If you're looking to stitch everything together on session ids, there are several ways to do this sort of thing. Here are two

(sourcetype=A) OR (sourcetype=B ipaddr=*) | transaction session | search http_status>=400

and sometimes stats is the best tool for the job, although here you need a little mvexpand and it's a bit more complicated:

(sourcetype=A ) OR (sourcetype=B ipaddr=*) | stats count first(ipaddr) as ipaddr values(http_status) as http_status by session | mvexpand http_status | search http_status>=400 | fields ipaddr

2) However depending on how sparse the http_status values over 400 are, you might want to take the approach of using a subsearch. The following will use a subsearch to get all the session id's that had errors, and then it searches on all of those session Id's against sourcetype B. Then finally we just pipe to top to get the top ipaddr values.

sourcetype=B ipaddr=* [sourcetype=A http_status>=400 | fields session] | top ipaddr

View solution in original post

sideview
SplunkTrust
SplunkTrust

Let's say that the events in A have sourcetype="A", and the events in B have sourcetype="B". Maybe in your actual data it'll be source or host or something else but it doesnt matter.

1) If you're looking to stitch everything together on session ids, there are several ways to do this sort of thing. Here are two

(sourcetype=A) OR (sourcetype=B ipaddr=*) | transaction session | search http_status>=400

and sometimes stats is the best tool for the job, although here you need a little mvexpand and it's a bit more complicated:

(sourcetype=A ) OR (sourcetype=B ipaddr=*) | stats count first(ipaddr) as ipaddr values(http_status) as http_status by session | mvexpand http_status | search http_status>=400 | fields ipaddr

2) However depending on how sparse the http_status values over 400 are, you might want to take the approach of using a subsearch. The following will use a subsearch to get all the session id's that had errors, and then it searches on all of those session Id's against sourcetype B. Then finally we just pipe to top to get the top ipaddr values.

sourcetype=B ipaddr=* [sourcetype=A http_status>=400 | fields session] | top ipaddr

travispowell
Path Finder

2 -- Stats worked the best 🙂

0 Karma

travispowell
Path Finder

No problem. Thanks for that. I updated my question so it's clearer for anyone else.

0 Karma

sideview
SplunkTrust
SplunkTrust

I see. sorry I misinterpreted yr question. See updated answer.

0 Karma

travispowell
Path Finder

because those are the two logs that have http_status >= 400 in log type A.

0 Karma

travispowell
Path Finder

Let me explain better.

The above example should return the first and last logs from B:

April 18, session=12345 ipaddr=89.76.45.34

April 20, session=45678 ipaddr=79.73.25.39

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...