Splunk Search

Splunk Dedup by _time and Combine Values Simultaneously

cameronwt
Engager

I am working with Exchange 2010 data. I have the MessageID, Sender, Recipients, and _time. Depending on the event type, recipients can be split (i.e. all recipients for a given message are not included in the event, but are split across multiple events). Here is an example of the data:

_time    MessageID    Sender    Recipients
4:25      <12345>     Sender1   Recipient1
4:50      <12345>     Sender1   Recipient2

I use this query to combine multiple Recipients values into one event, by MessageID and Sender:

index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
| stats values(Recipients) as recip by MessageID, Sender
| table _time, MessageID, Sender, recip

This results in:

_time     MessageID   Sender    recip
          <12345>     Sender1   Recipient1; Recipient2

_time is not populated, because values wouldn't work if _time was included in the by statement since each event for a particular MessageID happens at a different time. Therefore, I need to somehow dedup MessageID (to get the latest _time) and stuff Recipients values simultaneously.

I attempted this:

| stats max(_time) as datetime by MessageID, Sender
| stats values(Recipients) as recip by datetime, MessageID, Sender
| table datetime, MessageID, Sender, recip

But this doesn't work either, and it results in recip not being populated:

_time    MessageID    Sender    recip
4:50      <12345>     Sender1   

My desired output looks like this:

_time     MessageID   Sender    recip
4:50      <12345>     Sender1   Recipient1; Recipient2

How should I achieve that?

0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

View solution in original post

somesoni2
SplunkTrust
SplunkTrust

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...