Solved: Count timechart by column names

harrychen · 11-07-2013

Sample log:

2013-11-01-10:11:34 userName=abc, download=1

2013-11-01-10:11:50 userName=abc, download=1

2013-11-01-10:30:29 userName=def, download=1

2013-11-01-11:11:34 userName=abc, download=1

2013-11-01-12:11:34 userName=xyz, download=1

What I need: for EACH individual user, the total number of minutes in the search time range during which that user did NOT download anything.

My approach so far:

sourcetype="someScr" download>0 | timechart span=1m count(download) by userName

What I got:

_time	abc	def	xyz
2013-11-01-10:11:00	2	0	0
2013-11-01-10:12:00	0	0	0
2013-11-01-10:13:00	0	0	0
...
2013-11-01-10:30:00	0	1	0
2013-11-01-11:11:00	1	0	0
...
2013-11-01-12:11:00	0	0	1

The goal is to count the rows with a 0 for EACH individual user in the table above.
Sample results:

abc	def	xyz
118	119	119

The result table shows that over the 2-hour (120-minute) range, abc has no downloads in 118 of the minutes (abc is active in 2 of them, 10:11 and 11:11), while def and xyz each have no downloads in 119 minutes.

I would like to do something like:

sourcetype="someScr" download>0 | timechart span=1m count(download) by userName | count (_time) by userName where VALUE=0

I'm stuck on the last step: how do I refer to the user column and its value and apply a where clause?
Or is there a better approach?

Thanks!

harrychen · 11-08-2013

So I figured it out.

Instead of counting the zero rows, I subtract each user's active minutes from the total number of minutes in the query's time range.

Here is a sample query:

Thank you all for your help!
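
A minimal sketch of the subtraction idea described above; it is not necessarily the query harrychen used. It assumes the search has a bounded time range, so that addinfo can supply info_min_time and info_max_time (with an "All time" search, info_max_time is not a usable number). The field names active_minutes, total_minutes and idle_minutes are made up for illustration:

```spl
sourcetype="someScr" download>0
| bin _time span=1m
| stats dc(_time) AS active_minutes by userName
| addinfo
| eval total_minutes=round((info_max_time - info_min_time) / 60)
| eval idle_minutes=total_minutes - active_minutes
| table userName active_minutes total_minutes idle_minutes
```

For the sample log searched over a 2-hour range, this should give abc 2 active minutes and therefore 118 idle minutes. Users with no download events in the range do not appear in the output at all.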

somesoni2 · 11-08-2013

You can try this:

index=_internal
| bucket span=1m _time | stats count by _time | fields - count | eval joinfield=1
| join type=outer max=0 joinfield
    [search sourcetype="someScr" download>0 | stats count by userName | fields - count | eval joinfield=1]
| fields - joinfield
| join type=outer max=0 _time, userName
    [search sourcetype="someScr" download>0 | bucket _time span=1m | stats count by _time, userName]
| eval count=coalesce(count, 0)
| where count=0
| stats count by userName

SEARCH UPDATED

This should work now. The first section collects every _time value at a 1-minute interval (assuming you have access to the _internal index, which generally has data every minute). I then join that with the unique list of userName values, which gives a row for every minute and every userName. The rest is the same.

Explanation:
The first portion of the search (before the last join) produces a table of every minute (for the selected time range) and userName combination. So for a 60-minute period and 3 users, this should give 180 rows.
The subsearch of the last join gives the list of user and minute combinations for which download > 0 (we don't care about the actual count).
This is left outer joined to the first portion to give every minute and userName with the corresponding count of events with download > 0. For minutes with no download > 0 (minutes that have no corresponding events in the join subsearch), the count will be NULL, which is converted to 0. Finally, keep only the rows with count=0 and run stats on them.
The format of the final table will be different from what you are expecting.
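
The same "every minute for every user, then keep the zeros" logic can also be sketched without join. This assumes timechart fills empty one-minute buckets with 0 across the search time range, as in the table in the question. untable turns the per-user columns back into rows so that where can filter on the value. limit=0 is there so that timechart does not fold users beyond its default series limit into OTHER, and idle_minutes is just an illustrative name:

```spl
sourcetype="someScr" download>0
| timechart span=1m limit=0 count(download) by userName
| untable _time userName count
| where count=0
| stats count AS idle_minutes by userName
```

As with the join version, the result has one row per user rather than one column per user.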

harrychen · 11-08-2013

I tried your query. It is not working so far.
I think the bucket command drops the minutes in which no user has a download. In my example, it will not list the rows from 2013-11-01-10:12:00 to 2013-11-01-10:29:00 and from 2013-11-01-11:12:00 to 2013-11-01-12:10:00.

It will just be:

_time	abc	def	xyz
2013-11-01-10:11:00	2	0	0
2013-11-01-10:30:00	0	1	0
2013-11-01-11:11:00	1	0	0
2013-11-01-12:11:00	0	0	1

Thoughts?
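
One way to see the difference, as a sketch: after bucket and stats, only minutes that contain at least one event exist as rows, so the empty minutes have to be added back explicitly. Assuming makecontinuous and fillnull behave as documented, something like the following restores the gaps, though only between the first and last minute that have data, not over the whole search time range:

```spl
sourcetype="someScr" download>0
| bucket _time span=1m
| stats count by _time, userName
| xyseries _time userName count
| makecontinuous _time span=1m
| fillnull value=0
```

timechart does this gap filling on its own, which is why the table in the question already contains the all-zero rows.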

jtrucks · 11-07-2013

Read the docs on eventstats; perhaps this will work:

sourcetype="someScr" download>0 | timechart span=1m count(download) by userName | eventstats sum | head 1 | fields - _time

You might have to fiddle with it a tad, but in a very basic test this got me a result similar to what you are looking for. However, I don't have a dataset where a count takes more than one value, so I could not replicate your case exactly. This should get you most of the way there, though!

--
Jesse Trucks
Minister of Magic

harrychen · 11-08-2013

That is not correct. Maybe I didn't make my question clear. I have edited it.
