I'm taking a shot at providing metrics on day 1 retention numbers of users in our system (Create a profile and the next day use server again).
I can grab a list of users and the _time they created their profile easily:
sourcetype=Profile event=profile__create | table user _time
I have a general server_log response event that will capture user activity:
sourcetype=server_log event=RESPONSE | bucket _time span=1d | stats values(user) by _time
So, how can I take the profile creation events, look forward a day of the event and see if there was activity by that user?
I've been working in this direction with a subsearch, but have unable to complete the query...
sourcetype=server_log event=RESPONSE [search sourcetype=Profile event=profile__create | table user _time] | bucket _time span=1d | stats values(user) by _time
Any advice/direction is greatly appreciated.
Sorry, I realize this was poorly worded.
My expected result is to get the % of users who come back to our site the next day after creating a profile. Daily/Weekly/Monthly granularity.
EX:
User comes to the site on Monday, Jan. 1st and creates a profile.
This fires a profile_create event with the user's id in the field __user_.
The next day, we see a response event in the server_logs, so this user was retained on day 1.
If one hundred people create profiles in a day/week/month, what is the % of users that come back on day 1? (timechart or stats)
Thanks again for your input!
Worked out a solution, with big thanks to somesoni2 and sundareshr for their help.
sourcetype=Profile event=profile__create | eval mytime=strftime(_time, "%Y%m%d") | eval _time= strptime(mytime, "%Y%m%d") | stats count by _time | rename count as total_count
| join type=left _time [search sourcetype=Profile event=profile__create | eval mytime=strftime(_time, "%Y%m%d") | eval _time= strptime(mytime, "%Y%m%d") | fields _time user
| join type=inner _time user [search sourcetype=server_log event=RESPONSE | eval mytime=strftime(_time, "%Y%m%d") | eval _time= strptime(mytime, "%Y%m%d")-86400 | dedup _time user | fields _time user] | stats count by _time | rename count AS "day1"]
| eval perc=day1/total_count | bucket _time span=$granularity$ | fillnull day1 perc value=0 | stats sum(total_count) AS "Total Profiles Created" sum(day1) AS "User's that returned on day 1" avg(perc) AS "% of Users that returned on day 1" by _time
Couldn't find the best way to format it for readability, so I apologize for that mess. Basically,
1. I search and get all profile create events and reconfigure the _time attribute to snap to the basic date
2. I left join on _time with a big subsearch that will return only events that match (inner join) the user and _time of a profile creation AND a server response that gets booted back to the previous day (the day the profile was created).
3. Now i have a table of _time by dates, total_count of created profiles, and count of events that match activity the day after the user created a profile
4. I bucket this time on a span that can be configured (1d,7d,1mon) in a dashboard, fill null counts with 0, and table things out in stats
Probably unconventional and unoptimized. for a last 30 day search, it takes about 10 - 15 seconds for the query to crunch. I'm sure i'll revisit this at a later date to refactor, but right now it fits my needs.
Give this a try as well (may perform better, but check if the results are same as your final query)
sourcetype=Profile event=profile__create | bucket span=1d _time | eventstats count as total_count by _time | fields _time user total_count
| join type=inner _time user [search sourcetype=server_log event=RESPONSE | bucket span=1d _time | dedup _time user | fields _time user event | eval _time=relative_time(_time,"-1d") ]
| eventstats count(eval(event="RESPONSE")) a day1 by _time | table _time user total_count day1
| eval perc=day1/total_count | bucket _time span=$granularity$ | fillnull day1 perc value=0 | stats sum(total_count) AS "Total Profiles Created" sum(day1) AS "User's that returned on day 1" avg(perc) AS "% of Users that returned on day 1" by _time
Not sure If I understood the requirement completely. Are you looking for
1) find users created in Day1
2) Go to Day2 and find if those users identified in step 1 have any activity.
If that is correct, try this,
(sourcetype=server_log event=RESPONSE) OR (sourcetype=Profile event=profile__create) |bucket _time span=1d | chart count(eval(sourcetype="server_log")) as Activity count(eval(sourcetype="Profile") as Created by user _time | streamstats current=f window=1 values(Created) as prev by user | eval Result=if(prev>0 AND Activity>0,"Created and activity next day","")
Try something like this
sourcetype=server_log event=RESPONSE [search sourcetype=Profile event=profile__create | head 1 | eval earliest=_time+3600 | latest=now() | table earliest latest]