How to use Splunk to find slight variations in email message_subjects and file_names?

packet_hunter
Contributor

I am hoping to find a way to sift through large volumes of email to find messages with similar subjects or similar attachment names.

Currently I might search by subject or attachment name.

For example,

index=mail sourcetype="mail" 
    [search index=mail sourcetype="mail" message_subject = *<something>*  |stats count by internal_message_id | fields internal_message_id]
    |eval Time=strftime(_time, "%H:%M:%S") | eval Date=strftime(_time, "%A %F") 
    |stats list(message_subject) as subj list(sender) as sender list(recipient) as recp list(file_name) as AttachmentName list(attachment_type) as AttachmentType list(vendor_action) as status values(Time) as Time values(Date) as Date by internal_message_id 

or

index=mail sourcetype="mail" 
    [search index=mail sourcetype="mail" file_name = *<something>*  |stats count by internal_message_id | fields internal_message_id]
    |eval Time=strftime(_time, "%H:%M:%S") | eval Date=strftime(_time, "%A %F") 
    |stats list(message_subject) as subj list(sender) as sender list(recipient) as recp list(file_name) as AttachmentName list(attachment_type) as AttachmentType list(vendor_action) as status values(Time) as Time values(Date) as Date by internal_message_id 

I am looking to find all variations or patterns of similar emails, for example:
subj = Order-008796, Order-008948, Order-009485, etc.
AttachmentName = Order#00879, Order-008948, Order#009485, etc. (extensions like .doc are already parsed out natively in the log)

What's the best way to find similar patterns? The cluster command? Any other ideas?

Thank you


richgalloway
SplunkTrust

There are a few ways to do that, depending on the patterns you want to match. One is to use wildcards in the base search:

index=mail sourcetype="mail" message_subject="Order-*" | ...

or use the like function in a where clause:

index=mail sourcetype="mail" | where like(message_subject,"Order-%") | ...

or use the regex command:

index=mail sourcetype="mail" | regex message_subject = "Order-\d{6}" | ...
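
When the prefix or digit count is not known in advance, another option is to normalize each value into a pattern and count the patterns instead of matching one pattern at a time. The following is only a minimal sketch under stated assumptions, not a query tested against this data: it reuses the message_subject and internal_message_id fields from the question, and subj_pattern, messages, and examples are hypothetical field names introduced here for illustration. Every run of digits is replaced with a # placeholder, so Order-008796 and Order-008948 both collapse to Order-#.

index=mail sourcetype="mail" message_subject=*
    | eval subj_pattern=replace(message_subject, "\d+", "#")
    | stats count dc(internal_message_id) as messages values(message_subject) as examples by subj_pattern
    | sort - count

The same eval could be applied to file_name. Because the attachment names in the question mix - and # as separators, a broader expression such as "[-#]?\d+" may be needed there so that both forms map to the same pattern.
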
---
If this reply helps you, Karma would be appreciated.


packet_hunter
Contributor

Thank you, Rich. Before I accept your answer, I just wanted to get your opinion on using the cluster command. When would you typically use it?

Thank you


richgalloway
SplunkTrust
SplunkTrust

I haven't used the cluster command, but it could apply in this case. I wonder what you'd get from

index=mail sourcetype="mail" | cluster field=message_subject | ...
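
As a starting point for that experiment, here is a hedged sketch of how cluster might be used to label similar subjects without collapsing the events. It assumes the same index, sourcetype, and message_subject field as the question; subjects is a hypothetical output field name. The options shown (field, t, showcount, labelonly) are documented options of the cluster command, and cluster_label is its default label field, but the threshold t=0.8 is simply the default and not a value tuned for this data.

index=mail sourcetype="mail" message_subject=*
    | cluster field=message_subject t=0.8 showcount=true labelonly=true
    | stats count values(message_subject) as subjects by cluster_label
    | sort - count

Short subjects such as Order-008796 contain very few terms, so the grouping may be sensitive to t and to the match option (termlist, termset, or ngramset). Some experimentation with both is likely needed before the clusters are useful.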


packet_hunter
Contributor

Thanks for the reply. I was thinking of cluster as more of an automatic check that needs fewer manual changes to the query.

I will experiment a bit, and post a new question in a while.

Thank you
