Splunk Search

Get a count all events by event type that happened 24hrs before a different ending event

metersk
Path Finder

I am trying to get counts of all certain events that happened before a user purchased on our site and so far, I am working with this:

earliest=-1d@d 
(ns=interaction msg=match direction=received gen=m is_p="" rematch=0) OR (ns=interaction msg=yes direction=received gender=m is_p="") OR (ns=interaction msg=eis_reply direction=received gen=m is_p="") OR (ns=interaction msg=single_message direction=received gen=m is_p="" reply=1) OR (ns=interaction msg=single_message direction=received gen=m is_p="" reply="") OR (ns=interaction msg=eis direction=received gen=m is_p="") OR (ns=revenue msg=sale recurring=0 item_number=*m) `money_countries` 
| transaction uid maxspan=1d startswith=(ns="interaction") endswith=(msg="sale") 
| table uid eventcount msg

The ending event is (ns=revenue msg=sale recurring=0 item_number=*m) and I want to get counts for each of these events that happened 24hrs before a user purchased: (ns=interaction msg=match direction=received gen=m is_p="" rematch=0) OR (ns=interaction msg=yes direction=received gen=m is_p="") OR (ns=interaction msg=eis_reply direction=received gen=m is_p="") OR (ns=interaction msg=single_message direction=received gen=m is_p="" reply=1) OR (ns=interaction msg=single_message direction=received gen=m is_p="" reply="") OR (ns=interaction msg=eis direction=received gen=m is_p="")

Should I be using a subsearch or a transaction query, maybe with an event count?

Im also thinking this - Could I use where statements to say that all of the interaction events happened w/in 24hrs of the purchase event? I'm just not sure how to do it on a per user basis with so many events.

1 Solution

sideview
SplunkTrust
SplunkTrust

Well, I played around with your search a bit, and refactored the parens slightly so I could read it more easily. The strategy I recommend is to create a field called purchase_time that represents when the given user did finally purchase, then use streamstats to copy the last value seen for that new field back over every event for the given uid, calling it "nextPurchaseTime". Then we can calculate a delta for all the events in the set, relative to this nextPurchaseTime field and filter by that . uids that didn't have a purchase wont have a nextPurchaseTime, thus no delta, so this step will remove them all as well.

earliest=-7d@d (direction=received gen=m is_p="" (ns=interaction msg=match  rematch=0) OR (ns=interaction msg=yes ) OR (ns=interaction msg=eis_reply ) OR (ns=interaction msg=single_message reply=1) OR (ns=interaction msg=single_message  reply="") OR (ns=interaction msg=eis )) OR (ns=revenue msg=sale recurring=0 item_number=*m) `money_countries` 
| eval purchase_time=if(ns="revenue" AND msg="sale",_time,null())
| eval nsMsg=ns."_".msg 
| streamstats last(purchase_time) as nextPurchaseTime by uid 
| eval delta=nextPurchaseTime-_time 
| search delta<86400 delta>=0

At this point we have all the raw materials to do what you might need, but I'm not positive what that is.

You could tack this onto the above - | stats list(nsMsg) by uid and it would give you a little story of "interaction_reply" and "interaction_match" broken out for each user. Or you could tack on | stats count by nsMsg and get the raw overall incidences of all these other types of events in the funnel (in which case you might want to change the delta>=0 to delta>0 and thus filter out the "revenue_sale" events themselves).

But that should put you on the right track.

UPDATE.

Thanks for clarifying a bit more about what you need in the final report
1) "for every user who made a purchase in the last 7 days, what count of each interaction (so, broken out by nsMsg) did they receive 24 hrs prior to purchasing?". I would tack on

| chart count over uid by nsMsg

That will give you pretty much the exact description you gave. You can visualize as a table or a stacked bar chart depending on how many users there are.

2) "The ideal output would be an aggregation count and dc(uid) of all interaction received 24hrs prior to purchasing- count and dc(uid) so that I can get average interaction per user by interaction type. "

instead of that chart command, tack on

| stats count dc(uid) by nsMsg 

and go from there.

UPDATE2

an egregious and embarassing typo was discovered and fixed inline in the above answer. I had mistyped | eval delta=nextPurchaseTime-_time as | eval delta=purchase_time-_time and there was a long comment thread to troubleshoot this which comments have since been deleted. Sorry about that.

View solution in original post

sideview
SplunkTrust
SplunkTrust

Well, I played around with your search a bit, and refactored the parens slightly so I could read it more easily. The strategy I recommend is to create a field called purchase_time that represents when the given user did finally purchase, then use streamstats to copy the last value seen for that new field back over every event for the given uid, calling it "nextPurchaseTime". Then we can calculate a delta for all the events in the set, relative to this nextPurchaseTime field and filter by that . uids that didn't have a purchase wont have a nextPurchaseTime, thus no delta, so this step will remove them all as well.

earliest=-7d@d (direction=received gen=m is_p="" (ns=interaction msg=match  rematch=0) OR (ns=interaction msg=yes ) OR (ns=interaction msg=eis_reply ) OR (ns=interaction msg=single_message reply=1) OR (ns=interaction msg=single_message  reply="") OR (ns=interaction msg=eis )) OR (ns=revenue msg=sale recurring=0 item_number=*m) `money_countries` 
| eval purchase_time=if(ns="revenue" AND msg="sale",_time,null())
| eval nsMsg=ns."_".msg 
| streamstats last(purchase_time) as nextPurchaseTime by uid 
| eval delta=nextPurchaseTime-_time 
| search delta<86400 delta>=0

At this point we have all the raw materials to do what you might need, but I'm not positive what that is.

You could tack this onto the above - | stats list(nsMsg) by uid and it would give you a little story of "interaction_reply" and "interaction_match" broken out for each user. Or you could tack on | stats count by nsMsg and get the raw overall incidences of all these other types of events in the funnel (in which case you might want to change the delta>=0 to delta>0 and thus filter out the "revenue_sale" events themselves).

But that should put you on the right track.

UPDATE.

Thanks for clarifying a bit more about what you need in the final report
1) "for every user who made a purchase in the last 7 days, what count of each interaction (so, broken out by nsMsg) did they receive 24 hrs prior to purchasing?". I would tack on

| chart count over uid by nsMsg

That will give you pretty much the exact description you gave. You can visualize as a table or a stacked bar chart depending on how many users there are.

2) "The ideal output would be an aggregation count and dc(uid) of all interaction received 24hrs prior to purchasing- count and dc(uid) so that I can get average interaction per user by interaction type. "

instead of that chart command, tack on

| stats count dc(uid) by nsMsg 

and go from there.

UPDATE2

an egregious and embarassing typo was discovered and fixed inline in the above answer. I had mistyped | eval delta=nextPurchaseTime-_time as | eval delta=purchase_time-_time and there was a long comment thread to troubleshoot this which comments have since been deleted. Sorry about that.

metersk
Path Finder

Thanks for the above, this is very helpful. The end data I am trying to get is every interaction event that a user received 24hrs before they made a purchase. I tried stats list(nsMsg) by uid but the only events I am getting returned are revenue_sale. Any thoughts?

0 Karma

sideview
SplunkTrust
SplunkTrust

To clarify "every interaction event that a user did 24hrs before they made a purchase", do you mean that "every user did"?
I also assumed you needed to capture and report on these events for more than one user at a time. If it's just one user at a time a different approach based on subsearches might be faster.
And if you just want the raw events and not any report or visualization then you don't want to tack anything on the end.

0 Karma

metersk
Path Finder

The precise question to answer is, "for every user who made a purchase in the last 7 days, what count of each interaction (so, broken out by nsMsg) did they receive 24 hrs prior to purchasing?"

The ideal output would be an aggregation count and dc(uid) of all interaction received 24hrs prior to purchasing- count and dc(uid) so that I can get average interaction per user by interaction type.

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...