I would like to dedup a series of events and save the oldest event for each host. Is it possible to use dedup for that? From what I gather, dedup will use the _time field to decide which events to keep, keeping the most recent. Is it possible to make dedup grab older events or force it to use a different field to decide which events to keep?
Thx.
Another way could be to use stats, which I believe will be a little faster than dedup:
* | stats last(*) as *, last(_*) as _* by host
The only thing here is that it will show the results as a table (which in most cases is desirable).
If the data is huge, it will take a long time. Also, the default memory limit is 200MB; if the search exceeds it during evaluation, you might end up with incorrect results.
That is fine if you are OK with the time and memory consumption.
You should be able to use the sortby +_time clause to make that happen.
* | dedup MyField sortby +_time
For me, when I ran the search and checked the results, I got the earliest event.
I downvoted this post because sort by is applied to the results after deduping
Hi macrosgarcia
Downvoting should be reserved for suggestions/solutions that could be potentially harmful to a Splunk environment or that go completely against known best practices. Simply commenting with constructive feedback on the post you are concerned with will be more beneficial for the community to learn from.
Some of the most active members in Answers have helped set the standard for how voting etiquette should work in the Splunk community, which sets our culture apart from other Q&A forums. Upvote early and often to give credit where it’s due for high-quality posts, comment where you think feedback needs to be given, and only downvote if something potentially dangerous is suggested. If you’re interested in seeing how this voting etiquette was developed, check out this Splunk Answers post: https://answers.splunk.com/answers/244111/proper-etiquette-and-timing-for-voting-here-on-ans.html
Does this work? The docs say that sort by is applied to the results after deduping. However, if the docs are correct, making sort by part of dedup doesn't make sense.
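To make the two readings in this thread concrete, here is a toy model in Python. It is only a sketch and not Splunk's implementation: the events, the dedup and sort_by_time helpers, and the numeric _time values are all hypothetical, and it assumes events arrive newest first and that dedup keeps the first event it sees for each field value.

```python
def dedup(events, field):
    """Keep the first event seen for each value of `field`, preserving order."""
    seen = set()
    kept = []
    for ev in events:
        key = ev[field]
        if key not in seen:
            seen.add(key)
            kept.append(ev)
    return kept


def sort_by_time(events):
    """Return the events in ascending _time order (oldest first)."""
    return sorted(events, key=lambda ev: ev["_time"])


# Hypothetical events in the assumed default order: newest first.
events = [
    {"_time": 30, "host": "a"},
    {"_time": 25, "host": "b"},
    {"_time": 20, "host": "a"},
    {"_time": 10, "host": "b"},
]

# Reading 1: dedup first, then sort what is left.
# Each host keeps its newest event; the sort only reorders them.
dedup_then_sort = sort_by_time(dedup(events, "host"))

# Reading 2: sort first, then dedup.
# Each host keeps its oldest event.
sort_then_dedup = dedup(sort_by_time(events), "host")

print([(e["host"], e["_time"]) for e in dedup_then_sort])
print([(e["host"], e["_time"]) for e in sort_then_dedup])
```

If the first reading is what dedup ... sortby actually does, one way to sidestep the question is to do the sort as its own step before the dedup, for example * | sort 0 +_time | dedup host, assuming the result set is small enough to sort.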