I'm still going through the myriad of answers relating to this, but as of yet, have not found my answer. I am doing something that I am sure almost everyone does. I have 2 searches from 2 sourcetypes, let's call them A and B. Each have a requestID field to join on. In the A type, I have nothing I want to report on, but that is where I want to put my conditions. B is where the reportable fields are. How can I avoid using the join, which is super slow?
index=app
source=sourceA
host=hostA
sourcetype=A
timeoutSvc=*
|rename requestId AS RequestId
|join RequestId [search
index=app
host=hostB
source=sourceB
sourcetype=B
RequestId=*
]
|stats values(AgentUserName)
It sounds like you want to use a subsearch to use your criteria on the B side to whittle down the total RequestId values being received. After that you want to just use stats to group things by the RequestId. Any other fields you need like AgentUserName, throw it in the stats (here I've just added it as an extra group by field because that generally makes the most sense for categorical fields. The final stats with the values(foo) clause might not be exactly what you want ultimately but it matches your question.
index=app host=hostB source=sourceB sourcetype=B
[ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=*
| rename requestId AS RequestId
| table RequestId
]
| stats count by AgentUserName RequestId
| stats values(AgentUserName)
Now if you want to carry something through to the end from the A side, like the timeoutSvc values, that's pretty simple. Here we still use the A criteria in the subsearch to whittle down the RequestId's, but then we feed this into a disjunction amounting to A OR B.
index=app ( host=hostB source=sourceB sourcetype=B ) OR ( source=sourceA host=hostA sourcetype=A timeoutSvc=* )
[ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=*
| rename requestId AS RequestId
| table RequestId
]
| stats count by AgentUserName RequestId timeoutSvc
| stats values(timeoutSvc) values(AgentUserName)
You are not describing a join
but rather using the results of one search to qualify another search. You do this with a subsearch
that expands the search parameters for an outer search like this:
index=app host=hostB source=sourceB sourcetype=B [ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=* | rename requestId AS RequestId | fields RequestId ] | stats values(AgentUserName)
Based on this clarification:
I need to get the requestID from a narrow conditioned search"B", then use that requestID to pull values from events from another search"A" where those values in A don't exist in B, and report as if I have access to all fields from both searches.
This should do it:
index=app host=hostB source=sourceB sourcetype=B [ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=* | rename requestId AS RequestId | fields RequestId ] | stats values(AgentUserName) | append [ search index=app source=sourceA host=hostA sourcetype=A timeoutSvc=* | rename requestId AS RequestId ] | stats values(*) AS * BY RequestId | stats values(AgentUserName) by timeoutSvc
What if I wanted to include the timeoutSvc from the "A" search along with the unique fields from the "A" search? anytime I try and do that it find no results.
like | stats values(AgentUserName) by timeoutSvc
In that case, you need a totally different kind of search so you need to ask a different question (and it is a join
function that you desire). The most efficient way to do a join
is to not do a join
. You can do the join
with stats values(*) AS *
and then a different analysis with another stats
like this:
index=app (host=hostB source=sourceB sourcetype=B) OR (source=sourceA host=hostA sourcetype=A) | eval normalizedRID=coalesce(requestId, RequestId) | stats values(*) AS * by normalizedRID | stats values(AgentUserName) by timeoutSvc
Unfortunately this will not work as both searches have a requestID, so the coalesce will always evaluate to a requestId making a search that should only return 10 results from the conditions placed on the "B" search, return all results from both searches.
Basically I want to :
Get the requestID from a narrow conditioned search"B", then use that requestID to pull values from events from another search"A" where those values in A don't exist in B, and report as if I have access to all fields from both searches.