Keeping highest 99% of values for each field value

jrnastase
Explorer

Hello all. I have calculated measures of a given statistic for a variety of values for the field "Link", and I need to keep the top 99% of values for each Link, and then find the average/minimum of what is left over. Any idea how to do that?

I tried sorting, then dedup X Link to return the top X values, but the problem is each link has a different number of points. Any help would be greatly appreciated!

DalJeanis
Legend

Yes, you've got it precisely. It's not possible to eliminate the bottom 1% without first making a pass over the data to find the cutoff, so eventstats is required. Then you have to pass over the data again to get the new average.

In other contexts, you can look at the outlier command for one-step cleaning. By default it moves inward everything that is outside of 2.5x the interquartile range.
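
As a rough sketch of that one-step approach, assuming the field is named statistic as in this thread: action=transform and param=2.5 are, as far as I know, the documented defaults and are only spelled out here for clarity. uselower=true is my addition, since by default the command only looks at values above the median. I don't believe outlier accepts a BY clause, so this would apply across all events rather than per Link.

```spl
| outlier action=transform param=2.5 uselower=true statistic
```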

jrnastase
Explorer

Not sure if this is the most efficient solution, but here's what I have so far, and it seems to work:

| eventstats perc01(statistic) as statistic_01p BY Field
| where statistic >= statistic_01p
| stats avg(statistic), min(statistic) BY Field
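
To try the same pattern without real data, here is a self-contained sketch built on generated events. The two Link values, the event count, and the random statistic are made up purely for illustration, and the output field names (avg_statistic, min_statistic) are my own choice. It uses perc1 for the 1st percentile.

```spl
| makeresults count=200
| streamstats count as n
| eval Link=if(n%2==0, "A", "B"), statistic=random()%1000
| eventstats perc1(statistic) as statistic_01p by Link
| where statistic >= statistic_01p
| stats avg(statistic) as avg_statistic, min(statistic) as min_statistic by Link
```

Because the cutoff is computed per Link by eventstats, each Link keeps roughly its own top 99% regardless of how many points it has.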
