Splunk Search

Timechart: Returning Users

HeinzWaescher
Motivator

Hello,

I want to achieve a timechart with a stat for returning Users, which means the number of unique users who have been active on day n and on day n-1 This should look like this in the end:

07.01.2013 1500

08.01.2013 2000

09.01.2013 2500

...and so on.

The first number is the count of unique users, that have been active on 07.01.2013 and 06.01.2013.

The second number is the count of unique users, that have been active on 08.01.2013 and 07.01.2013.

The third number is the count of unique users, that have been active on 09.01.2013 and 08.01.2013.

and so on...

A hint how to achieve this would be great.

Thanks in advance

Heinz

Tags (1)
0 Karma

HeinzWaescher
Motivator

Hi,

another idea came to my mind:

  • set the time to 00:00 per each day
  • dedup user,_time (I would expect that every user will have only one event per day of activity?!)
  • delta _time per user (I would expect, that this calculates the difference of _time: event time of User A - previous event time of User A
  • after that a filter could be set to keep only events where the new calculated field = 86400
  • timechart dc(user) span=1d

Do you think that is possible to get the needed results? Espacially I'm not sure how to realize the step "delta _time per user".

Thanks in advance

Heinz

0 Karma

HeinzWaescher
Motivator

Hi,

thanks for your answer.

When I run your search, there a are only a very few "activeBothDays" with a count of one. I think this is caused, because it is only counted, when the difference of two subsequent events is exactly 86400 sec. I it does not count, if somebody played 01.01.2013 23:00:00 and 02.02.2013 14:00:00. In this case this player should be counted as returned for day 02.02.2013.

do you have an idea how to adjust the search?

Best

Heinz

0 Karma

HeinzWaescher
Motivator

Hello lguinn,

unfortunately it still doesn't seem to work properly. Only a few rows in the timechart have a count, which is always 1.

0 Karma

lguinn2
Legend

Hi Heinz - that is what the bucket _time span=1d should do - set all the times to 00:00 of that day. So the differences really should be exactly 86400 seconds. Perhaps the calculation I used is faulty. I have edited the answer above, so can you try it again?

0 Karma

lguinn2
Legend

Try this

yoursearchhere
| bucket _time span=1d
| stats count by user _time
| sort _time
| delta _time AS timeDelta p=1
| streamstats last(user) as lastUser current=f
| eval activeBothDays = if( timeDelta==86400 AND user==lastUser,1,0)
| timechart span=1d sum(activeBothDays) as UserCount 

First it generates a list of users that were active for each day. Then it computes the difference in time between subsequent events - and collects the name of the user from the prior event. If the user names match and the events are exactly 1 day apart, then activeBothDays is set to 1. The sum of the activeBothDays field is the number of users who were active on both the current day and the prior day.

I would test this by running

yoursearchhere
| bucket _time span=1d
| stats count by user _time
| sort _time
| delta _time AS timeDelta p=1
| streamstats last(user) as lastUser current=f
| eval activeBothDays = if( timeDelta==86400 AND user==lastUser,1,0)
| table _time user lastUser timeDelta activeBothDays
0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...