I've got data say in following format
name,department,location,score
jack,finance,houston,220
jill,finance,london,490
jake,finance,paris,200
jude,finance,vegas,600
tom,developer,dubai,350
dave,developer,vegas,800
joe,developer,california,600
john,cto,moscow,700
judy,ceo,washington,750
The output needs ranking of the data grouped by department ; & ONLY the first two Ranks based on score should be displayed
(Also please note, there could be one entry for certain department and in that case only Rank1 needs to be displayed)
Required output is something like:
rank,name,department,location,score
rank1,jude,finance,vegas,600
rank2,jill,finance,london,490
rank1,lizy,support,vegas,900
rank2,bond,support,dubai,600
rank1,john,cto,moscow,700
rank1,judy,ceo,washington,750
Any simple SPL can you suggest?
Like this:
|makeresults
| eval raw="jack,finance,houston,220::jill,finance,london,490::jake,finance,paris,200::jude,finance,vegas,600::tom,developer,dubai,350::dave,developer,vegas,800::joe,developer,california,600::john,cto,moscow,700::judy,ceo,washington,750"
| makemv delim="::" raw
| mvexpand raw
| rename raw AS _raw
| rex "^(?<name>[^,]+),(?<department>[^,]+),(?<location>[^,]+),(?<score>[^,]+)$"
| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"
| sort 0 -score department
| streamstats current=t count AS JuNkT3Mp_rankNoTies BY department
| search JuNkT3Mp_rankNoTies<3
| eventstats min(JuNkT3Mp_rankNoTies) AS JuNkT3Mp_newFieldName BY score department
| fields - JuNkT3Mp_rankNoTies
| rename JuNkT3Mp_newFieldName AS rank
Like this:
|makeresults
| eval raw="jack,finance,houston,220::jill,finance,london,490::jake,finance,paris,200::jude,finance,vegas,600::tom,developer,dubai,350::dave,developer,vegas,800::joe,developer,california,600::john,cto,moscow,700::judy,ceo,washington,750"
| makemv delim="::" raw
| mvexpand raw
| rename raw AS _raw
| rex "^(?<name>[^,]+),(?<department>[^,]+),(?<location>[^,]+),(?<score>[^,]+)$"
| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"
| sort 0 -score department
| streamstats current=t count AS JuNkT3Mp_rankNoTies BY department
| search JuNkT3Mp_rankNoTies<3
| eventstats min(JuNkT3Mp_rankNoTies) AS JuNkT3Mp_newFieldName BY score department
| fields - JuNkT3Mp_rankNoTies
| rename JuNkT3Mp_newFieldName AS rank
All hail the hero. Thank you very much @woodcock. Perfect answer and perfectly put with comments
For those of you looking for a more general ranking, remove the | search JuNkT3Mp_rankNoTies<3
part.
The streamstats
command can do that. This example creates a running count for each department then discards all but the first two in each one.
... | streamstats count as rank by department | where rank <=2 | table rank name department location score
it didn't work as expected as the "score" was not part of the streamstats equation.
One of the virtues of streamstats
is it does not filter out fields so all fields are available downstream even if they aren't part of the streamstats
command.
What results did you get? I get this with your sample data.
rank name department location score
1 judy ceo washington 750
1 john cto moscow 700
1 joe developer california 600
2 dave developer vegas 800
1 jude finance vegas 600
2 jake finance paris 200
i got the above result too, but if you see the result is not taking into account "score". Hence the rank is wrong as it just sorts by department
( I understand the concept of streamstats, but my issue i'm not able to find an option to use, like max or perc )