Splunk Search

How to extract the email address from my logs at either search time or index time?

smudge797
Path Finder

I need to extract the email address from the following logs, either in a search or via props.conf / transforms.conf. Any help much appreciated.

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To:
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From:
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To:
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To:
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: New SMTP ICID 57360165 interface InternalNet (10.123.78.14) address 10.123.245.103 reverse dns host unknown verified no

Thanks!

smudge797
Path Finder

The emails inside the < > were removed from my post, so removing the brackets shows all the emails. Ideally I need the To: and From: addresses as searchable fields, discoverable in the discovered fields panel:

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To: me.rongan@gmail.com
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From: susmith@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To: oliver.whatmail@orange-blah.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To: bnagpalxxx@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: New SMTP ICID 57360165 interface InternalNet (10.123.78.14) address 10.123.245.103 reverse dns host unknown verified no

Looks like my sample text had the emails removed that were inside the < >. By removing the < > you can see the emails. Is there a way to have the To & From as searchable fields?

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To: blah@gah.com
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From: susmith@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '<54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To: oliver.whatmail@orange-blah.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To: bnagpalxxx@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: New SMTP ICID 57360165 interface InternalNet (10.123.78.14) address 10.123.245.103 reverse dns host unknown verified no

0 Karma

kristian_kolb
Ultra Champion

You can use the 'code sample' formatting option when you have 'special' characters in your posts. It's the little button with "101010". Also, I updated my answer above.

/k

0 Karma

kristian_kolb
Ultra Champion

With rex (searchtime) it's as easy as;

your search |  rex "<(?<myemail>[^>]+)" |  blah blah 

In props.conf (also searchtime)

[your sourcetype]
EXTRACT-blah = <(?<myemail>[^>]+)

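Don't try to do it at index time. That is not what you want: index-time fields enlarge the index and only apply to data indexed after the change, whereas a search-time extraction can be adjusted whenever you like.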


UPDATE:

To extract email addresses into field names based on their context inside an event, you might want to try something like;

props.conf

[your sourcetype]
EXTRACT-to = \sTo:\s(?<to_addr>\S+)
EXTRACT-from = \sFrom:\s(?<from_addr>\S+)

This should work for the to/from email addresses of the formats below.

Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 RID 0 To: me.rongan@gmail.com
Sep 28 20:59:57 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842705 ICID 57360165 From: susmith@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842704 Message-ID '<54dae016a65c48b1804eddd51059c847@DBCXEXCHMBX002.my.domain.in.com>'
Sep 28 20:59:57 10.123.78.15/10.123.78.15 Mailrcc2_Splunk_Syslog_Push: Info: MID 20248631 ICID 58528527 RID 0 To: oliver.whatmail@orange-blah.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ICID 57360164 RID 0 To: bnagpalxxx@expand.com
Sep 28 20:59:56 10.123.78.14/10.123.78.14 Mailrcc1_Splunk_Syslog_Push: Info: MID 20842703 ready 1581 bytes from prvs=341af728e=admin@whatthedom.com
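
Before putting the two extractions into props.conf, the same patterns can be tried out at search time with rex. This is only a sketch: "your search" stands for whatever base search returns these events, and the From pattern assumes that whitespace precedes "From:" in the raw event, as it does in the samples above.

```spl
your search
| rex "\sTo:\s(?<to_addr>\S+)"
| rex "\sFrom:\s(?<from_addr>\S+)"
| table _time to_addr from_addr
```

Note that \S+ captures everything up to the next whitespace, so if the addresses in the raw events are wrapped in < >, the brackets will end up in the field value too.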

If you also want to extract the other addresses, you could add the following;

EXTRACT-other = [=<](?<other_addr>[a-zA-Z0-9.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+)

Should work

/k

smudge797
Path Finder

In props.conf
Using EXTRACT-to = \sTo:\s(?<to_addr>\S+) is working well 🙂
Using EXTRACT-from = \From:\s(?<from_addr>\S+) is showing as | from_addr=<> when searching?

Some are working but the majority are showing blank.

0 Karma

ulrich_track
Path Finder

I would use this regex:
<([0-9A-Za-z.]+)(@)([0-9A-Za-z.]+)>

Tested it with regex101.com

0 Karma

ulrich_track
Path Finder

True - I forgot - in Splunk you do not search for the string, you search for what is before and after it.

0 Karma

kristian_kolb
Ultra Champion

That extracts three separate strings, and it does not put them into a field.
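
A single named capture group around the whole address would cover both points. As a rough sketch (the field name "email" is just an illustrative choice, and the character classes are kept from the regex above, so addresses containing a hyphen or an = sign will not match):

```spl
your search
| rex "<(?<email>[0-9A-Za-z.]+@[0-9A-Za-z.]+)>"
| table _time email
```

By default rex keeps only the first match in each event, which is fine when there is one address per line, as in these samples.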

markthompson
Builder

You might be able to use the transaction command, e.g. transaction startswith="<" endswith=">".

0 Karma

kristian_kolb
Ultra Champion

No, it does not work that way. A transaction is a method for grouping separate events together, based on some characteristic, such as a common field value.
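
For comparison, a typical use of transaction groups the events that share a field value. A minimal sketch, assuming the message ID always appears as "MID" followed by digits, as in the sample logs; the field name MID and the 5 minute maxspan are illustrative choices, not something taken from this thread:

```spl
your search
| rex "\sMID\s(?<MID>\d+)"
| transaction MID maxspan=5m
| table _time MID duration eventcount
```

Each result row is then one group of events for a given MID, with the duration and eventcount fields that transaction adds.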
