I believe the following two queries are essentially the same but when I run them the former returns results with 10s more events per week then the other. Anyone have an idea on why the small variations?
index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats count by _time, logged_user_id | stats sum(count) AS UniqueUsersSeen by _time
index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats dc(logged_user_id) AS UniqueUsersSeen by _time
For example running the following...
index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats dc(logged_user_id) AS UniqueUsersSeen by _time | appendcols [search index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats count by _time, logged_user_id | stats sum(count) AS UniqueUsersSeen2 by _time]
Gets me...
_time UniqueUsersSeen UniqueUsersSeen2
9/5/2013 9 9
9/12/2013 39475 39501
9/19/2013 45320 45360
9/26/2013 15049 15061
10/3/2013 7573 7580
10/10/2013 4055 4056
10/17/2013 3346 3352
10/24/2013 3169 3170
10/31/2013 2685 2687
11/7/2013 3146 3147
11/14/2013 3734 3737
11/21/2013 3023 3026
11/28/2013 2365 2368
12/5/2013 2450 2454
12/12/2013 5617 5617
12/19/2013 11907 11912
12/26/2013 10782 10782
1/2/2014 8163 8165
1/9/2014 5078 5080
1/16/2014 3014 3016
1/23/2014 2400 2402
1/30/2014 1965 1967
2/6/2014 1732 1748
2/13/2014 1587 1588
2/20/2014 1499 1542
2/27/2014 1574 1691
3/6/2014 1502 1588
3/13/2014 1599 1735
3/20/2014 2034 2239
3/27/2014 1211 1283
These two searches are not identical, because the first one looks at the count of events by userid and time whereas the other looks at the distinct count of users by time. If all users occurred in only one event each this would be the same, but if any user occurs in more than one event it won't - and it seems likely this is what is happening here that makes you see a difference.
First query gives you NoOfUserLogins (not UniqueUsersSeen)
These two searches are not identical, because the first one looks at the count of events by userid and time whereas the other looks at the distinct count of users by time. If all users occurred in only one event each this would be the same, but if any user occurs in more than one event it won't - and it seems likely this is what is happening here that makes you see a difference.
Oops yeah the following would be equivalent...
index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats count by _time, logged_user_id | stats count by _time | stats sum(count) AS UniqueUsersSeen by _time
...and...
index=myindex build_type=prod event_type="Setup complete" | bucket _time span=1week | stats dc(logged_user_id) AS UniqueUsersSeen by _time
...knew it was obvious once I stepped back from it!
Thanks folks