Comments and answers for "Find a distinct count using a priority system."
https://answers.splunk.com/answers/194474/find-a-distinct-count-using-a-priority-system.html
The latest comments and answers for the question "Find a distinct count using a priority system."
Answer by acharlieh
https://answers.splunk.com/answering/194835/view.html
In answering your [follow up question](http://answers.splunk.com/comments/194817/view.html) I felt like this had some strange machinations of its own, so it deserved its own answer. (Vote up if it's useful, or if mods can take your follow-up and this answer and split them out into a new question, that'd be cool too.)
The goal is a timechart of the number of players in each tier, making sure to remove players from their previous tier when they bump up a tier. First, the solution I came up with:
:: your search ::
| streamstats max(result) as tier by player
| dedup player, tier
| eval add = 1
| streamstats current=false last(tier) as remove by player
| appendpipe [search remove=* | rename remove as tier | eval add = -1]
| sort count=0 _time
| streamstats sum(add) as in_tier by tier
| timechart last(in_tier) by tier
| streamstats last(*) as *
So let's break this down.
`| streamstats max(result) as tier by player | dedup player, tier` - find the player's tier for every record, and keep only the records where the tier changes
`| eval add = 1` - for every result we have now, we want to add one to the tier
`| streamstats current=false last(tier) as remove by player` - at every step, find the last tier the player was in, if any
`| appendpipe [search remove=* | rename remove as tier | eval add = -1]` - for every previous tier, add a result at the same time for that previous tier, with add being a decrement instead of an increment
`| sort count=0 _time` - put the decrementing results in the proper time series order
`| streamstats sum(add) as in_tier by tier` - calculate the number currently in each tier as each record comes in
`| timechart last(in_tier) by tier` - for every time segment, the number currently in a tier is the last in_tier value we saw for that tier
`| streamstats last(*) as *` - fill in gaps where a particular time segment has no change for a particular tier
and voila!
![Pretty timechart!!!][1]
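If it helps to see the bookkeeping outside of SPL, here is a minimal Python sketch of the same increment/decrement idea. It is not the search itself: the function name `tier_counts` and the sample events are made up for illustration, and it assumes events arrive sorted by time with a numeric `result`.

```python
from collections import defaultdict

def tier_counts(events):
    """events: iterable of (time, player, result), assumed sorted by time.

    Returns a list of (time, {tier: players_in_tier}) snapshots, one per tier change.
    """
    best = {}                   # player -> highest result so far (streamstats max ... by player)
    in_tier = defaultdict(int)  # tier -> running count (streamstats sum(add) ... by tier)
    snapshots = []
    for time, player, result in events:
        previous = best.get(player)
        if previous is not None and result <= previous:
            continue                    # tier unchanged: this record is dropped by the dedup
        if previous is not None:
            in_tier[previous] -= 1      # the appended add = -1 record for the old tier
        in_tier[result] += 1            # the add = 1 record for the new tier
        best[player] = result
        snapshots.append((time, dict(in_tier)))
    return snapshots

# Hypothetical sample data, not taken from the original question.
events = [
    (1, "alice", 0),
    (2, "bob", 1),
    (3, "alice", 2),
    (4, "bob", 0),
    (5, "carol", 0),
]
snapshots = tier_counts(events)
```

Each snapshot plays the role of one `in_tier` value per tier; taking the last snapshot in a time bucket and carrying it forward through empty buckets corresponds to the final `timechart` and `streamstats last(*)` steps.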
Now, dear reader, I'll leave it as an exercise for you to figure out the best way to start each tier in your search with a particular non-zero value. (You probably want to insert events, with the tier and add fields specified, at the same time as or just before the first event or the beginning of your search window, before calculating the number in each tier.)
[1]: /storage/temp/17453-screen+shot+2014-11-24+at+19.01.05.png
Tue, 25 Nov 2014 01:30:35 GMT (acharlieh)
Comment by acharlieh on acharlieh's comment
https://answers.splunk.com/comments/195460/view.html
That's correct: stats is removing the _time field. It would be quite a different search, likely involving a dedup and maybe bin as well.
Mon, 24 Nov 2014 21:51:03 GMT (acharlieh)
Comment by JWBailey on JWBailey's answer
https://answers.splunk.com/comments/194817/view.html
Follow up question:
How can I get this data broken down over a time window? This search gives me the entire time window as one set of data, but I would like to break it down by day or by hour, for example.
When I try to use timechart I get no results found. I assume this is caused by the stats command not making the _time field accessible anymore?
Mon, 24 Nov 2014 21:27:14 GMT (JWBailey)
Comment by JWBailey on JWBailey's comment
https://answers.splunk.com/comments/194803/view.html
Yes, this is exactly it. Thank you.
Sorry for the misrepresentation of my question resulting in a more complex answer. I do appreciate knowing how to do it each way now...
Mon, 24 Nov 2014 16:42:39 GMT (JWBailey)
Comment by acharlieh on acharlieh's comment
https://answers.splunk.com/comments/192618/view.html
Updated answer per comment :) It actually seems to become much easier if you already have priority numbers.
Mon, 24 Nov 2014 16:21:58 GMT (acharlieh)
Comment by JWBailey on JWBailey's answer
https://answers.splunk.com/comments/194799/view.html
Yes, this is great!
One other question. I realized my example might not be a perfect representation of what I am trying to do. In my real data, the values of “result” are not text strings. How would it affect this if, instead of “win” and “loss”, the values were numbers? Result could then be 2, 1, or 0.
It seems like this falls short in the case function. How do I reference the number of twos or ones identified in the chart command?
I know I could use eval and if statements to get a field with text values in it, but I don’t want to add extra work to the search if I don’t need to.
Mon, 24 Nov 2014 15:24:56 GMT (JWBailey)
Answer by acharlieh
https://answers.splunk.com/answering/192479/view.html
How about this:
:: your search :: | chart count over player by result | eval result=case(Win>=1,"Win",Tie>=1,"Tie",1==1,"Lose") | stats count as num_players by result
This assumes, of course, that result can only have the values "Win", "Tie", and "Lose". We get the count of each result for each player, then we figure out which result each player counts toward, and then build stats of how many players count toward each result.
If you also want the count of events with each of these results in addition to the players at each... so far what I've come up with is a bit complicated:
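For anyone who wants to check the priority logic away from Splunk, here is a rough Python sketch of what the `case()` step is doing. The function name `players_by_best_result` and the sample events are hypothetical, and it assumes the same three result values.

```python
from collections import Counter

# Highest priority first, mirroring the order of the tests in case().
PRIORITY = ["Win", "Tie", "Lose"]

def players_by_best_result(events):
    """events: iterable of (player, result). Count each player once, under their best result."""
    seen = {}  # player -> set of results seen for that player
    for player, result in events:
        seen.setdefault(player, set()).add(result)

    counts = Counter()
    for results in seen.values():
        best = "Lose"  # fallback branch, like 1==1 in case()
        for candidate in PRIORITY:
            if candidate in results:
                best = candidate
                break
        counts[best] += 1
    return dict(counts)

# Hypothetical sample data for illustration only.
events = [
    ("p1", "Lose"), ("p1", "Win"), ("p1", "Win"),
    ("p2", "Tie"), ("p2", "Lose"),
    ("p3", "Lose"),
]
num_players = players_by_best_result(events)
```

Here p1 counts only as a "Win" even though p1 also has a "Lose" event, which is the distinct count by priority that the question asks for.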
:: your search ::
| chart count over player by result
| eval result=case(Win>=1,"Win",Tie>=1,"Tie",1==1,"Lose")
| appendpipe [stats count as num_players by result | eval keep=1]
| appendpipe [where NOT keep=1 | fields - result num_players keep | untable player result count | where count > 0 | eval keep=1]
| where keep=1
| stats sum(count) as events first(num_players) as num_players by result
Also just in case you were curious this is what I've been using as `:: your search ::` to try things out:
| gentimes start=-2 increment=1h
| rename starttime as _time
| fields
| eval player="player"+tostring(floor((random()/2147483648)*100)+1)
| eval win=floor((random()/2147483648)*3)
| eval result=case(win==0,"Lose",win==1,"Tie",win==2,"Win")
----
Update per comment:
Assuming result takes the values 2, 1, and 0, where 2 means "win", 1 means "tie", and 0 means "loss" (i.e. the values are already in priority order), the process stays basically the same, except that if you are only interested in the priority you don't need the full stats, just a simple max:
:: your search :: | stats max(result) as result by player | stats count as num_players by result
Doing the count of events in addition to the players of each becomes much easier as well:
:: your search ::
| stats count by player, result
| eventstats max(result) as player_result by player
| stats sum(count) as events count(eval(player_result==result)) as num_players by result
Sat, 22 Nov 2014 00:59:48 GMT (acharlieh)
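As a sanity check on the numeric version, here is a small Python sketch of the same two numbers per result: how many events had that result, and how many players have it as their maximum. The function name `summarize` and the sample events are assumptions for illustration, not part of the original search.

```python
from collections import defaultdict

def summarize(events):
    """events: iterable of (player, result) with numeric results.

    Returns {result: (events, num_players)}, where num_players counts the
    players whose highest result equals that result.
    """
    best = {}                      # player -> max result (eventstats max(result) by player)
    event_count = defaultdict(int) # result -> number of events (sum(count) by result)
    for player, result in events:
        event_count[result] += 1
        best[player] = max(result, best.get(player, result))

    player_count = defaultdict(int)  # result -> players whose best result is this one
    for result in best.values():
        player_count[result] += 1

    return {r: (event_count[r], player_count[r]) for r in sorted(event_count)}

# Hypothetical sample data for illustration only.
events = [("p1", 0), ("p1", 2), ("p2", 1), ("p2", 0), ("p3", 0)]
summary = summarize(events)
```

With this sample, result 0 has three events but only one player (p3) whose best result is 0, since p1 and p2 are counted under 2 and 1 respectively.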