Solved: Count timechart by column names

harrychen · ‎11-07-2013

Sample log:

2013-11-01-10:11:34 userName=abc, download=1

2013-11-01-10:11:50 userName=abc, download=1

2013-11-01-10:30:29 userName=def, download=1

2013-11-01-11:11:34 userName=abc, download=1

2013-11-01-12:11:34 userName=xyz, download=1

What I need: find the total minutes that INDIVIDUAL user does NOT download anything over the period of time.

My approach so far:

sourcetype="someScr" download>0 | timechart span=1m count(download) by userName

What I got:

_time	abc	def	xyz
2013-11-01-10:11:00	2	0	0
2013-11-01-10:12:00	0	0	0
2013-11-01-10:13:00	0	0	0
...
2013-11-01-10:30:00	0	1	0
2013-11-01-11:11:00	1	0	0
...
2013-11-01-12:11:00	0	0	1

The goal is to count all the 0 rows for INDIVIDUAL user from the upper table:
Sample results

abc	def	xyz
118	119	119

The result table shows that over 2 hours, abc doesn't download anything for 118 minutes, and 119 minutes for def and xyz.

I would like to do something like:

sourcetype="someScr" download>0 | timechart span=1m count(download) by userName | count (_time) by userName where VALUE=0

I'm stuck on the last step, that is how do I refer to the user and the field value and apply a where clause.
Or is there a better approach.

Thanks!

harrychen · ‎11-08-2013

So I figure it out.

Instead of counting the "0" numbers, I subtract the active minutes from the query total minutes.

Here is a sample query:

Thank you all for your help!

View solution in original post

harrychen · ‎11-08-2013

So I figure it out.

Instead of counting the "0" numbers, I subtract the active minutes from the query total minutes.

Here is a sample query:

Thank you all for your help!

somesoni2 · ‎11-08-2013

You can try this :-

   index=_internal | bucket span=1m _time | stats count by _time|fields - count |eval joinfield=1 | join type=outer max=0 joinfield
[search sourcetype="someScr" download>0 |stats count by userName |fields - count |eval joinfield=1]|fields - joinfield
|join type=outer max=0 _time,userName 
[search sourcetype="someScr" download>0  
|bucket _time span=1m | stats count by _time,userName ] 
| eval count=COALESCE(count,0) |where count=0 |stats count by userName

SEARCH UPDATED

This should work now. In the first section I am taking all the _time value with 1 min interval (assuming you have access to _internal index which generally have data every minute). I am then joining that with unique list of userName which gives me _time and userName for every minute and every userName. Rest is same.

Explaination:
first portion of search (before join) will give a table with all the minutes (for selected timerange) and userName combination. So for 60m period and 3 users, this should give 180 events.
the join inner query gives list of user and minute combination for which download > 0 (we don't worry about the count).
This is joined left outer with first portion to give list of all minute and userName with corresponding count of events with download >0. For minutes where there was no download >0 (minutes for which there are not corresponding events in join subquery), the count will be NULL which will be converted to 0. Finally filter all rows with count =0 and get a stats of that.
The format of final table will be different from what you are expecting.

harrychen · ‎11-08-2013

I tried your query. Not working so far.
I think the bucket command will eliminate the minutes that all users don't have downloads. In my example, it will not list the rows from 2013-11-01-10:12:00 to 2013-11-01-10:29:00 and from 2013-11-01-11:12:00 to 2013-11-01-12:10:00.

It will just be

_time abc def xyz
2013-11-01-10:11:00 2 0 0
2013-11-01-10:30:00 0 1 0
2013-11-01-11:11:00 1 0 0
2013-11-01-12:11:00 0 0 1

thoughts?

jtrucks · ‎11-07-2013

Read docs on eventstats and perhaps this will work:

sourcetype="someScr" download>0 | timechart span=1m count(download) by userName | eventstats sum | head 1 | fields - _time

You might have to fiddle with it a tad, but on a very basic test, this got me a similar result as you are looking for. However, I don't have a dataset with more than one value for a count for an exact replica. This should get you most of the way there, though!

--
Jesse Trucks
Minister of Magic

harrychen · ‎11-08-2013

That is not correct. Maybe I didn't make my question clear. I have edited it.

Count timechart by column names

SEARCH UPDATED

Announcing Scheduled Export GA for Dashboard Studio

Extending Observability Content to Splunk Cloud

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!