I have two sourcetypes with data as follows:
First sourcetype:
tx_id=1, event=error, extra=foo
tx_id=1, event=error, extra=bar
tx_id=2, event=info
tx_id=2, event=error, extra=baz
tx_id=3, event=info
Second sourcetype:
tx_id=1, url=/A
tx_id=2, url=/B
tx_id=3, url=/C
What I would like to get is a table with all error
events from the first log, the particular extra
info and the corresponding url
from the second log. That is, (ideally) the result would be
extra=foo, url=/A
extra=bar, url=/A
extra=baz, url=/B
I know SQL fairly well and this would be a plain, simple "join". But I just cannot get this to work in Splunk with a | transaction
(which will not give me the two rows in the tx_id=1
case), a | join
or | stats
.
As this must be really straightforward, I am probably missing the obvious and would appreciate any help or pointers.
Thanks a lot!
Try this:
| makeresults
| eval raw="tx_id=1,event=error,extra=foo,sourcetype=A::tx_id=1,event=error,extra=bar,sourcetype=A::tx_id=2,event=info,sourcetype=A::tx_id=2,event=error,extra=baz,sourcetype=A::tx_id=3,event=info,sourcetype=A::tx_id=1,url=/A,sourcetype=B::tx_id=2,url=/B,sourcetype=B::tx_id=3,url=/C,sourcetype=B"
| makemv delim="::" raw
| mvexpand raw
| rename raw AS _raw
| kv
| fields - _time _raw
| rename COMMENT AS "Everything above generates sample data; everything below is your solution"
| search (sourcetype="A" AND event="error") OR sourcetype="B"
| stats values(*) AS * BY tx_id
| stats values(url) AS url BY extra
Try this:
| makeresults
| eval raw="tx_id=1,event=error,extra=foo,sourcetype=A::tx_id=1,event=error,extra=bar,sourcetype=A::tx_id=2,event=info,sourcetype=A::tx_id=2,event=error,extra=baz,sourcetype=A::tx_id=3,event=info,sourcetype=A::tx_id=1,url=/A,sourcetype=B::tx_id=2,url=/B,sourcetype=B::tx_id=3,url=/C,sourcetype=B"
| makemv delim="::" raw
| mvexpand raw
| rename raw AS _raw
| kv
| fields - _time _raw
| rename COMMENT AS "Everything above generates sample data; everything below is your solution"
| search (sourcetype="A" AND event="error") OR sourcetype="B"
| stats values(*) AS * BY tx_id
| stats values(url) AS url BY extra
That almost did it 😉 but finally helped me to solve it – thanks!
First, I found that the "base" result can be obtained much more efficiently by doing
* [ search sourcetype="A" event="error" | fields tx_id ]
I suspect that's because I have only few "A" type errors, but lots of "B" type events and so Splunk can avoid retrieving the unneeded events in the first place?
Then, | stats values(*) AS * BY tx_id
seems like a feat to me. Am I right that this is somewhat like using |transaction
, in that it "groups" together all events with the same tx_id
, but maintains the different field values? Does that create multi-valued fields per tx_id
?
Last, in fact I had not only one extra
field but two ( extra
and extra2
). I omitted those in the initial question for clarity.
Thus, | stats values(url) AS url BY extra
did not work out, but | stats values(extra), values(extra2) by url
did.
Final search:
* [ search sourcetype="A" event="error" | fields tx_id ]
| stats values(*) AS * BY tx_id
| stats values(extra), values(extra2) by url
Thanks!
Hi
Can you try this search?
sourcetype=first_sourcetype event=error | dedup count by tx_id,extra
append [
search sourcetype=second_sourcetype | dedup tx_id url
] | stats values(extra) as extra values(url) as url by tx_id
Here, I have used dedup
to handle the duplicate event. You can remove it if no duplicate events will come.
Thanks
| dedup count by tx_id, extra
? That seems to yield an empty result. dedup
remove? I would like to keep the duplicate "error" event, maybe I got this wrong...?