Solved: Searching most recent events with the same _time

phoenixdigital · ‎09-24-2012

OK, we are currently receiving two sets of data: a preliminary version (received first) and a finalised version (received later). Both sets have the same structure and the same _time values after import into the same sourcetype.

When performing calculations we only want to use the most recent value for each _time.

Prelim data

UID, In Date, Update Time, Vol, Corr Vol
453,May 1 2012 6:00AM,May 2 2012 3:24PM,133,223.000000000
453,May 1 2012 7:00AM,May 2 2012 3:24PM,104,175.000000000
453,May 1 2012 8:00AM,May 2 2012 3:24PM,90,152.000000000

Final data

UID, In Date, Update Time, Vol, Corr Vol
453,May 1 2012 6:00AM,May 2 2012 3:24PM,140,223.000000000
453,May 1 2012 7:00AM,May 2 2012 3:24PM,110,175.000000000
453,May 1 2012 8:00AM,May 2 2012 3:24PM,93,152.000000000

Now I know I can use the following search and it will return the most recent version:

sourcetype="Flow" UID=453 | dedup _time

Now, while this works, it is undocumented, and we would hate for such a 'feature' to be changed and then break the Splunk app we are developing.

Can someone confirm this is the only way to achieve this or is there a better way?

Ayn · ‎09-24-2012

What is undocumented? dedup _time? While I guess that PARTICULAR usage example for dedup might not be explicitly stated in the docs, both the dedup command and the _time field are definitely not going anywhere soon.

But I don't know if there's any guarantee that, given two events with identical timestamps, Splunk is going to choose the newest one. I would consider differentiating the events using the field it would check anyway to see which event is newer: _indextime, which is what it sounds like, a field containing the time (in epoch format) when Splunk indexed an event.
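
For anyone who wants to check this on their own data first, here is a minimal sketch that lists events sharing a _time next to the time they were indexed. It assumes the Vol column from the sample data is extracted as a field named Vol; indexed_at is just an illustrative name, not a built-in field.

```spl
sourcetype="Flow" UID=453
| eval indexed_at=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| sort 0 _time, -_indextime
| table _time, indexed_at, Vol
```

If the preliminary and final rows for the same _time show different indexed_at values, then _indextime should be enough to tell them apart.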

phoenixdigital · ‎09-24-2012

Thank you, _indextime would be perfect.

I wasn't thinking that dedup was undocumented or would go away, but rather that the way it behaves with _time might change. That was the undocumented part I was referring to.

sourcetype="Flow" UID = 453 | dedup _time sortby -_indextime

will give consistent results.
