Splunk Search

Averaging Transaction Length by Operation, Excluding Outliers

Adam
Explorer

I can get the average duration of each of our operations with the following query:

* | transaction transactionId | stats avg(duration) as avgDuration by operation 

But I'd like to exclude outliers (we log asynchronously, so occasionally we end up with a message that gets logged a few minutes after the transaction ends).

It seems I can do

* | transaction transactionId | outlier duration | stats avg(duration) as averageDurationExcludingOutliers by operation

but this excludes outliers before taking the average, and we have some operations that are inherently much longer than others. I'd really like to exclude outliers within each operation.

I've tried

* | transaction transactionId | stats values(duration) as v by operation | outlier v | stats avg(v) as averageDurationExcludingOutliers by operation

but this seems to have no effect (results are the same as the first query).

Any ideas?

0 Karma
1 Solution

cphair
Builder

Assuming you're only concerned about overly long operations, one option is to eliminate all results above the Xth percentile:


... | transaction transactionId | eventstats perc90(duration) as Perc90 by operation | where duration<Perc90 | stats avg(duration) as avgdur by operation

Another option is to use the median rather than the average, which won't be as skewed by outlying values:


... | transaction transactionId | stats median(duration) as avgdur by operation

According to the docs, using outlier with no arguments truncates the outlying value to the outlier "limit", so the number you get back might not reflect any actual search you ran. Using action=rm to remove did odd stuff on my test data (deleted the duration field but not the split-by field), so I don't know if it will do the right thing on already transformed data. Let meknow if either of those work for you.

View solution in original post

cphair
Builder

Assuming you're only concerned about overly long operations, one option is to eliminate all results above the Xth percentile:


... | transaction transactionId | eventstats perc90(duration) as Perc90 by operation | where duration<Perc90 | stats avg(duration) as avgdur by operation

Another option is to use the median rather than the average, which won't be as skewed by outlying values:


... | transaction transactionId | stats median(duration) as avgdur by operation

According to the docs, using outlier with no arguments truncates the outlying value to the outlier "limit", so the number you get back might not reflect any actual search you ran. Using action=rm to remove did odd stuff on my test data (deleted the duration field but not the split-by field), so I don't know if it will do the right thing on already transformed data. Let meknow if either of those work for you.

Adam
Explorer

I'll have to play with the first one some, but using median never even occurred to me... And it should work fine for my needs.

Thanks!

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...