Splunk Search

Find top n in each group

saumitra
Engager

I have a collection of records in [object_name, execution_time] format. I want to gather top 10 (i.e. first 10 in sorted sequence) execution time values for each object.

I could extract execution times grouped by object name by

index=myindex | stats values(execTime) as MaxTime by objectName

But I could not find way to sort the individual groups and getting top 10 in each of them.

Thanks,

[EDIT]

Looks like I did not explain the question well. Actually I have a database for how much time did the web service take to execute, in different calls. (I called web service as 'object_name' above). There are various web services.
e.g. I have web services w1,w2, .., wn. I have following records
[w1,e11] [w1,e12] [w1, e13] ... [w2,e21] [w2,e22] [w2,e23] .... [wn,en1] [wn,en2] [wn,en3] ....
e11 represents the time w1 took in its first call.

Now I want the following
w1: t11,t12, ...,t1n
w2: t21,t22, ...,t2n
...

t11, t12, t1n are top n for w1. Top n are the highest 'n' values of e11,e12,,e13 ... in sorted sequence.
e.g. If e11,e12,e13,... are ,8,8,8,8,10,7,7,7,9,9 then Top two are 10,9.

Now Splunk top function would not work here because I would give me 8,7 which are the most frequent two. I don't want that.

It's like list of students' scores in each subject. I want n highest scoring students for each subject. Hope that now it's clear.

Tags (3)
0 Karma
1 Solution

chris
Motivator

Like Ayn I don't quite understand what you're after. Maybe you could post some sample events with the desired output format.

Maybe one of those works for you:

index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  stats  list(eval(if(count<11,execTime,null()))) as MaxTime  by objectName


index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  where count < 11

View solution in original post

chris
Motivator

Like Ayn I don't quite understand what you're after. Maybe you could post some sample events with the desired output format.

Maybe one of those works for you:

index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  stats  list(eval(if(count<11,execTime,null()))) as MaxTime  by objectName


index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  where count < 11

saumitra
Engager

Sorry for delay, After bit tweaking in your last query, following worked for me:
index=myIndex | eval time=execution_time/1000000.0 | table object_name,time | sort -time | streamstats count by object_name | stats list(eval(if(count<11,execution_time,null()))) as Values by object_name
Thanks !!

0 Karma

chris
Motivator

Hm my bad, sorry. I updated the answer. The streamstats part was not correct. This is a search that works on internal splunk data: index=_internal source="metrics.log" largest_size="" | table name,largest_size | sort -largest_size | streamstats count by name | stats list(eval(if(count<11,largest_size,null()))) as Values by name

0 Karma

saumitra
Engager

Tried these, but both gave more than 10 results for every objectName.

0 Karma

Ayn
Legend

Not entirely sure what exact output you're after. top 10 execTime by objectName?

saumitra
Engager

I have edited qn now. Top would not work for me.

0 Karma

saumitra
Engager

Any response, please?

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...