Sample log:
2013-11-01-10:11:34 userName=abc, download=1
2013-11-01-10:11:50 userName=abc, download=1
2013-11-01-10:30:29 userName=def, download=1
2013-11-01-11:11:34 userName=abc, download=1
2013-11-01-12:11:34 userName=xyz, download=1
What I need: find the total minutes that INDIVIDUAL user does NOT download anything over the period of time.
My approach so far:
sourcetype="someScr" download>0 | timechart span=1m count(download) by userName
What I got:
_time | abc | def | xyz | |
---|---|---|---|---|
2013-11-01-10:11:00 | 2 | 0 | 0 | |
2013-11-01-10:12:00 | 0 | 0 | 0 | |
2013-11-01-10:13:00 | 0 | 0 | 0 | |
... | ||||
2013-11-01-10:30:00 | 0 | 1 | 0 | |
2013-11-01-11:11:00 | 1 | 0 | 0 | |
... | ||||
2013-11-01-12:11:00 | 0 | 0 | 1 |
The goal is to count all the 0 rows for INDIVIDUAL user from the upper table:
Sample results
abc | def | xyz | |
---|---|---|---|
118 | 119 | 119 |
The result table shows that over 2 hours, abc doesn't download anything for 118 minutes, and 119 minutes for def and xyz.
I would like to do something like:
sourcetype="someScr" download>0 | timechart span=1m count(download) by userName | count (_time) by userName where VALUE=0
I'm stuck on the last step, that is how do I refer to the user and the field value and apply a where clause.
Or is there a better approach.
Thanks!
So I figure it out.
Instead of counting the "0" numbers, I subtract the active minutes from the query total minutes.
Here is a sample query:
sourcetype="dataSrc" download>0 | bucket _time span=1m | stats dc(_time) as "act" by userName | addinfo | eval inact = round((info_max_time - info_min_time)/60 - act, 0) | table userName, act, inact
Thank you all for your help!
So I figure it out.
Instead of counting the "0" numbers, I subtract the active minutes from the query total minutes.
Here is a sample query:
sourcetype="dataSrc" download>0 | bucket _time span=1m | stats dc(_time) as "act" by userName | addinfo | eval inact = round((info_max_time - info_min_time)/60 - act, 0) | table userName, act, inact
Thank you all for your help!
You can try this :-
index=_internal | bucket span=1m _time | stats count by _time|fields - count |eval joinfield=1 | join type=outer max=0 joinfield
[search sourcetype="someScr" download>0 |stats count by userName |fields - count |eval joinfield=1]|fields - joinfield
|join type=outer max=0 _time,userName
[search sourcetype="someScr" download>0
|bucket _time span=1m | stats count by _time,userName ]
| eval count=COALESCE(count,0) |where count=0 |stats count by userName
This should work now. In the first section I am taking all the _time value with 1 min interval (assuming you have access to _internal index which generally have data every minute). I am then joining that with unique list of userName which gives me _time and userName for every minute and every userName. Rest is same.
Explaination:
first portion of search (before join) will give a table with all the minutes (for selected timerange) and userName combination. So for 60m period and 3 users, this should give 180 events.
the join inner query gives list of user and minute combination for which download > 0 (we don't worry about the count).
This is joined left outer with first portion to give list of all minute and userName with corresponding count of events with download >0. For minutes where there was no download >0 (minutes for which there are not corresponding events in join subquery), the count will be NULL which will be converted to 0. Finally filter all rows with count =0 and get a stats of that.
The format of final table will be different from what you are expecting.
I tried your query. Not working so far.
I think the bucket command will eliminate the minutes that all users don't have downloads. In my example, it will not list the rows from 2013-11-01-10:12:00 to 2013-11-01-10:29:00 and from 2013-11-01-11:12:00 to 2013-11-01-12:10:00.
It will just be
_time abc def xyz
2013-11-01-10:11:00 2 0 0
2013-11-01-10:30:00 0 1 0
2013-11-01-11:11:00 1 0 0
2013-11-01-12:11:00 0 0 1
thoughts?
Read docs on eventstats and perhaps this will work:
sourcetype="someScr" download>0 | timechart span=1m count(download) by userName | eventstats sum | head 1 | fields - _time
You might have to fiddle with it a tad, but on a very basic test, this got me a similar result as you are looking for. However, I don't have a dataset with more than one value for a count for an exact replica. This should get you most of the way there, though!
That is not correct. Maybe I didn't make my question clear. I have edited it.