Can you add the details about the splunk search or searches involved? That will help us guide you to the best approach, between using 'stats', 'transaction', 'join' etc...
This topic comes up so regularly that I think someone needs to make a big poster or flowchart. At the top is "I'm pretty sure I need to use some kind of join", and almost all the outcomes involve using stats or transaction, or subsearches. and some tiny slice of the outcomes would actually use splunk's join
command, or it's cousins append
and appendcols
.
The Splunk for SQL users page is a really great resource, but unfortunately it simply says that the equivalent of a SQL join is the splunk join
command and while that's technically true, in a practical sense it doesn't get people where they're trying to go.
OK, forget the join
command for a while. We'll come back to it. You should first look at stats:
(
The important part is the by yourId. The idea is to let Splunk pull the events off disk in a jumble, and then let the stats command sort it all out for you. the 'various operations' can include 'avg', 'max', 'perc95', 'first', 'last', 'values', 'list'.. etc.. (In the case of the multivalued operations like 'list' and 'values', certain splunk search commands like mvexpand
can be useful after the stats. However if you're using them a lot you're usually overthinking something again. )
In some more complex cases, like where there are rex
and eval
in the two searches, it seems like you cannot combine the two searches into a simple disjunction (foo OR Bar). However you usually can do it even in these cases, with a little eval
command and a little searchmatch
. Search for eval and searchmatch on answers and you'll see some interesting cases.
In some cases where one of the search result sets is extremely small, the other is extremely large, and what you want at the end of the day is a small intersection, it's often a good idea to look at subsearches. The idea here is to search for one set of events, get a list of some distinct 'id' values, turn that around and search on it in a giant (id=foo OR id=bar OR id=baz ....) term against some other data.
If you dont think you have fallen into one of the above scenarios, you may actually be one of the use cases that only append
or join
or appendcols
can solve. However honestly you can probably still do it with stats, and if you can do it with stats you should. Searching for large result sets in any kind of subsearch is always going to be a little slow and subject to subtle problems it might take a long time to even see.
There are a lot of great questions and answers on the answers site. (Where you can find me saying the same thing over and over and over again)
http://splunk-base.splunk.com/answers/10377/jointransaction-with-field-comparison-operators
Unfortunately the search functionality on answers returns pretty poor results. The good news is that Google can pull them out for you quite easily:
http://splunk-base.splunk.com/answers/22681/newbie-splunk-equivalent-of-natural-join
-
i'm not a strong sql user, but this might be helpful to you:
http://www.innovato.com/splunk/SQLSplunk.html
not an official Splunk doc, but super useful!
I checked that page. Unfortunately, I can't tell how to access the columns on both sides of the join at the same time in order to subtract the after value from the before value.