Splunk Search

Splunk Dedup by _time and Combine Values Simultaneously

cameronwt
Engager

I am working with Exchange 2010 data. I have the MessageID, Sender, Recipients, and _time. Depending on the event type, recipients can be split (i.e. all recipients for a given message are not included in the event, but are split across multiple events). Here is an example of the data:

_time    MessageID    Sender    Recipients
4:25      <12345>     Sender1   Recipient1
4:50      <12345>     Sender1   Recipient2

I use this query to combine multiple Recipients values into one event, by MessageID and Sender:

index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
| stats values(Recipients) as recip by MessageID, Sender
| table _time, MessageID, Sender, recip

This results in:

_time     MessageID   Sender    recip
          <12345>     Sender1   Recipient1; Recipient2

_time is not populated, because values wouldn't work if _time was included in the by statement since each event for a particular MessageID happens at a different time. Therefore, I need to somehow dedup MessageID (to get the latest _time) and stuff Recipients values simultaneously.

I attempted this:

| stats max(_time) as datetime by MessageID, Sender
| stats values(Recipients) as recip by datetime, MessageID, Sender
| table datetime, MessageID, Sender, recip

But this doesn't work either, and it results in recip not being populated:

_time    MessageID    Sender    recip
4:50      <12345>     Sender1   

My desired output looks like this:

_time     MessageID   Sender    recip
4:50      <12345>     Sender1   Recipient1; Recipient2

How should I achieve that?

0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

View solution in original post

somesoni2
SplunkTrust
SplunkTrust

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...