Comments and answers for "count average for sorted 90% of the events (tricky)"
https://answers.splunk.com/answers/22948/count-average-for-sorted-90-of-the-events-tricky.html
The latest comments and answers for the question "count average for sorted 90% of the events (tricky)"Answer by gkanapathy
https://answers.splunk.com/answering/23025/view.html
This isn't that hard to do, but it's very hard to parallelize, i.e., it's very ineffective to map-reduce, compared with averaging below a percentile. May I ask what the mathematical basis for preferring this computation to simply averaging everything below the 90th percentile would be? Especially since they would only have different results when considering artificial and pathological data sets, particularly if you measure your response times with sufficient granularity.
... | sort response_time | eventstats count as ttl | streamstats global=t current=t count as pos | where pos<(0.9*ttl) | stats avg(response_time)Mon, 25 Apr 2011 16:33:10 GMTgkanapathyAnswer by amitsehgal
https://answers.splunk.com/answering/23021/view.html
Thanks for the prompt response. Sorry if I'm not clear. Let's say if i have response times: 1) data= 1,1,3,3,4,4,4,5,6,6,6,9,9,9,9,7,7,7,7,8,8,8,8,2,2 2) Sort data :1,1,2,2,3,3,4,4,4,5,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9 3) Discard Highest 10% of data and keep 90% by volume and not by value. I'm left with 1,1,2,2,3,3,4,4,4,5,6,6,6,7,7,7,7,8,8,8,8,9,9 (CORRECT) 4) take an average of the 90%
Did you see the difference if i use perc90(data) on above series I get '9' and If i will calculate average of series less than value 9 i would get 1,1,2,2,3,3,4,4,4,5,6,6,6,7,7,7,7,8,8,8,8 (WRONG) and would miss out on last two nines. Please compare the series WRONG and CORRECT. I hope i'm able to explain the issue.
Thanks, AmitMon, 25 Apr 2011 15:53:29 GMTamitsehgalAnswer by gkanapathy
https://answers.splunk.com/answering/22977/view.html
If I understand correctly, you calculate the arithmetic mean ("average") of all values that are below the (e.g.) 90th percentile? And it looks like you know how to do that. However, I can't really understand what you then want to do, i.e., what you mean by "sort and then head x percentage"?Sat, 23 Apr 2011 21:43:58 GMTgkanapathyComment by sideview
https://answers.splunk.com/comments/22973/view.html
I'm having difficulty understanding - in your example what does the "average X percentage of total which means may be 1,2,3,4,5,6,6" mean? Maybe add more examples?Sat, 23 Apr 2011 20:01:14 GMTsideview