Splunk Search

How to get _time at median value?

luanvn
Explorer

Hello everyone,

Now, I encountered hard problem that I can't solve for long times. I was also google on many hours but not result. The problem has following:

I wanna get median a value on search. I had that value by:

source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import" | stats median(el) as abc by URL

And I received a result following:

URL                      abc
/ShippingOrder/Import   29250

Yes, I got median value = 29250. But I wanna have add more a table that show _time at event happen has abc=29250. I searched following:

source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import" | stats median(el) as abc by URL | table URL abc _time

But table _time is blank.

I found that time by:

source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import" | table URL _time el

Result:

URL                 el                 _time
/ShippingOrder/Import   29016   2017-09-10 18:08:58
/ShippingOrder/Import   6657    2017-09-10 16:47:58
/ShippingOrder/Import   11656   2017-09-10 16:11:35
/ShippingOrder/Import   23906   2017-09-10 14:46:58
/ShippingOrder/Import   46719   2017-09-10 11:03:56
/ShippingOrder/Import   15016   2017-09-10 16:54:22
/ShippingOrder/Import   29250   2017-09-10 16:46:22
/ShippingOrder/Import   51188   2017-09-10 14:58:22
/ShippingOrder/Import   44000   2017-09-10 14:51:22
/ShippingOrder/Import   12046   2017-09-10 14:42:22
/ShippingOrder/Import   50984   2017-09-10 14:41:22
/ShippingOrder/Import   39735   2017-09-10 14:25:22

And at the time has abc is median value 29250 is 2017-09-10 16:46:22

So, how to I get result following by some search?

URL                     abc               Time
/ShippingOrder/Import   29250   2017-09-10 16:46:22

I appreciate your reply

Tags (2)
1 Solution

cmerriman
Super Champion

try this to see if it works for you:

 source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import"  | eventstats median(el) as abc by URL|eval medTime=if(abc=el,_time,null())|stats values(medTime) as _time values(abc) as abc by URL

View solution in original post

DalJeanis
SplunkTrust
SplunkTrust

CORRECTION - In Splunk, the calculation of the aggregate function median() does not match the mathematical definition, so the simpler code provided by @cmerriman is a complete solution. Below is the code for how you would do the real median() if that were correctly calculated.


Mathematically, median is not necessarily an actual number present in the dataset. There could be only one, or there could be more than one, or there could be none. Here's some examples to show why...

 1 2 3 6 9        median is 3, which is present in the data
 1 2 3 3 3 6 9    median is 3, three present in the data
 1 2 2 4 6 9     mathematical median is 3, which is not present in the data.  However, splunk picks 4 instead.

This following code is redundant under current implementation of median() in splunk. Just use @cmerriman's.

That mathematical definition would make it a little funky to get the _time, but it could be done. What we would do, is calculate the DIFFERENCE from the median, and then pass any records that have the lowest difference.

source=check_request app="test1" url="/ShippingOrder/Import" 
| rename url as URL 
| eventstats median(el) as UrlMedianEl by URL
| eval DeltaToMedian= abs(UrlMedianEl - el)
| eventstats min(DeltatoMedian) as minDeltaToMedian by URL
| where DeltaToMedia = minDeltaToMedian 
| table _time URL el DeltaToMedian 

See, wasn't that tricky?


Examples altered - changed 4 and 5 to 6 and 9 respectively to help distinguish median - the "middlest" value - from average.

Also, the whole thing has become moot, because splunk's median() just picks the higher of the two.

cmerriman
Super Champion

just for definition sake, median will grab the middle value of all present variables. not the middle value between min and max.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.1/SearchReference/Aggregatefunctions#median.28X...

   |makeresults|eval data="1,2,3,5,6,7"| makemv data delim=","| mvexpand data | rename data as _raw | kv|stats median(_raw)

using this, you get a median of 5, not 4, because Splunk should pick the higher of the two middle values

technically, if it is an even number, you're supposed to average the two middle numbers.
http://reference.wolfram.com/language/ref/Median.html

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

@cmerriman - Thanks! I didn't expect to have to look up how splunk implemented a standard mathematical term. Post has been updated and annotated, and I'm leaving my redundant brilliance in place for posterity, and as a warning to others.

0 Karma

cmerriman
Super Champion

try this to see if it works for you:

 source=check_request  app="test1"  | rename url as "URL"  | where URL="/ShippingOrder/Import"  | eventstats median(el) as abc by URL|eval medTime=if(abc=el,_time,null())|stats values(medTime) as _time values(abc) as abc by URL

luanvn
Explorer

That's great! It worked. Thanks so much cmerriman!

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

@luanvn, @cmerriman - you need a little more, because of the definition of median when there are even numbers of events. You are only covering two of the three cases, and the third one should occur in roughly 50% of the searches.

0 Karma

luanvn
Explorer

@DalJeanis Thank for considering about that.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...