Splunk Search

Splunk Dedup by _time and Combine Values Simultaneously

cameronwt
Engager

I am working with Exchange 2010 data. I have the MessageID, Sender, Recipients, and _time. Depending on the event type, recipients can be split (i.e. all recipients for a given message are not included in the event, but are split across multiple events). Here is an example of the data:

_time    MessageID    Sender    Recipients
4:25      <12345>     Sender1   Recipient1
4:50      <12345>     Sender1   Recipient2

I use this query to combine multiple Recipients values into one event, by MessageID and Sender:

index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
| stats values(Recipients) as recip by MessageID, Sender
| table _time, MessageID, Sender, recip

This results in:

_time     MessageID   Sender    recip
          <12345>     Sender1   Recipient1; Recipient2

_time is not populated, because values wouldn't work if _time was included in the by statement since each event for a particular MessageID happens at a different time. Therefore, I need to somehow dedup MessageID (to get the latest _time) and stuff Recipients values simultaneously.

I attempted this:

| stats max(_time) as datetime by MessageID, Sender
| stats values(Recipients) as recip by datetime, MessageID, Sender
| table datetime, MessageID, Sender, recip

But this doesn't work either, and it results in recip not being populated:

_time    MessageID    Sender    recip
4:50      <12345>     Sender1   

My desired output looks like this:

_time     MessageID   Sender    recip
4:50      <12345>     Sender1   Recipient1; Recipient2

How should I achieve that?

0 Karma
1 Solution

somesoni2
Revered Legend

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

View solution in original post

somesoni2
Revered Legend

You can just include _time in your aggregation. Say you want to retain _time of the first event for that MessageID, try like this

 index=ExchangeIndex sourcetype=MSExchange:2010:Message tracking
 | stats min(_time) as _time values(Recipients) as recip by MessageID, Sender
 | table _time, MessageID, Sender, recip

You can also use max(_time) to get latest event's timestamp.

Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...