Splunk Search

How to write regex to extract multiple email addresses into one field where each email is followed by an IP address?

gsawyer1
Engager

I want to extract a field that has multiple email addresses, each one followed by an IP address, all of which appear at the very end of an MS Windows event. My ultimate goal is to capture all of the email addresses in one field, up to the end of the event, and then remove the IP addresses so that I am left with just the email addresses. My regexes so far will not capture beyond the first "Authorized Recipient" email address, and there could be a hundred or more recipient addresses listed, depending on the size of a distro list.
Here's a sample of the event, with the exact formatting for how Splunk displays it in search results:

               Time
               LogName=etc
               EventCode=etc
               etc etc
               Message=Message Validation Success

              This action was requested by blah.blah@blah.com.    (-no problem extracting these other values to fields)

              Message Subject:

                blah

             Other info:

                blah blah

            blah text blah.

            Authorized Recipients:

                                  blah1.blah@blah.com        (000.000.000.000)     (- this is an IPv4 address)
                                  blah2.blah@blah.com        (000.000.000.000)    
                                  blah3.blah@blah.com        (000.000.000.000)
                                  etc, etc, etc.....

Here's the pertinent portion of my latest regex:
rex "(?m)Authorized\sRecipients:\s+(?P.*)"

...but it only captures the first email address and IP under recipients. I want to capture all of them, regardless of how many are listed.
I'm still a regex newbie, but I know the capture should be greedy, up to the end of the event.


somesoni2
Revered Legend

Try this

your base search  |  rex field=_raw "Recipients:\s+(?P<DataPortion>.*)" | rex field=DataPortion "(?P<EMAIL>\S+)\s+\((?P<IPADDRESS>\d+\.\d+\.\d+\.\d+)\)" max_match=0

gsawyer1
Engager

This gives me the same result. I think I understand the suggestion: run a second regex on the "DataPortion" field after it has been extracted. But all I got back was the same data, the first email address and IP address only. I still think I need to make the regex greedy enough to capture all of the email addresses, from the first one through any that follow, to the end of the event. The second regex you suggest will help afterwards, once I have the full capture, to strip the IP addresses out of "DataPortion", which I won't actually need.
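A likely explanation for why the greedy `.*` still stops after the first recipient: in PCRE-style regexes, `.` does not match a newline unless the `(?s)` flag is set, and `(?m)` only changes what `^` and `$` match. The sketch below illustrates this with Python's `re` module standing in for the regex engine behind `rex`. The `event` string is a hypothetical reconstruction of the sample above, not real log data.

```python
import re

# Hypothetical event tail modelled on the sample in the question.
event = (
    "Authorized Recipients:\n"
    "\n"
    "    blah1.blah@blah.com        (000.000.000.000)\n"
    "    blah2.blah@blah.com        (000.000.000.000)\n"
    "    blah3.blah@blah.com        (000.000.000.000)\n"
)

pattern = r"Recipients:\s+(?P<DataPortion>.*)"

# Default: "." does not match a newline, so the greedy ".*" ends with the first line.
first_line_only = re.search(pattern, event).group("DataPortion")

# (?m) only affects the ^ and $ anchors; "." still stops at the newline.
with_m = re.search("(?m)" + pattern, event).group("DataPortion")

# (?s) lets "." match newlines, so ".*" runs to the end of the event.
with_s = re.search("(?s)" + pattern, event).group("DataPortion")

print(repr(first_line_only))
print(with_s.count("@"))  # number of email addresses captured with (?s)
```

If the same holds in your environment, adding `(?s)` to the first rex, as in `rex field=_raw "(?s)Recipients:\s+(?P<DataPortion>.*)"`, should let DataPortion hold every recipient line, after which the second rex with `max_match=0` can split out the addresses. This is untested against the real events.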


theouhuios
Motivator

Use max_match=0, which tells rex to extract every match of the regex instead of only the first:

rex field=_raw "Recipients:\s+(?P<EMAIL>\S+)\s+\((?P<IPADDRESS>\d+\.\d+\.\d+\.\d+)\)" max_match=0

gsawyer1
Engager

Could the issue be that the "Recipients: " portion is not repeated more than once? The boundary, after the first recipient, changes from "Recipients: email@blah.com (IPaddress)" to just "email@blah.com (IPaddress)", repeated. Should I capture the first instance, and then look for other instances afterward? This would be conditional, as there may or may not be any additional addressees to follow.
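The effect of the literal "Recipients:" prefix can be checked outside Splunk. The following is a minimal sketch in Python, whose `re.findall` returns every non-overlapping match and so behaves roughly like `max_match=0`. The `event` string is a hypothetical stand-in for the real event, and the two patterns are taken from the replies in this thread.

```python
import re

# Hypothetical event tail modelled on the sample in the question.
event = (
    "Authorized Recipients:\n"
    "\n"
    "    blah1.blah@blah.com        (000.000.000.000)\n"
    "    blah2.blah@blah.com        (000.000.000.000)\n"
    "    blah3.blah@blah.com        (000.000.000.000)\n"
)

# Pattern that requires the literal heading before every email/IP pair.
with_prefix = r"Recipients:\s+(?P<EMAIL>\S+)\s+\((?P<IPADDRESS>\d+\.\d+\.\d+\.\d+)\)"

# Pattern that matches an email/IP pair wherever it appears.
without_prefix = r"(?P<EMAIL>\S+@\S+)\s+\((?P<IPADDRESS>\d+\.\d+\.\d+\.\d+)\)"

# findall returns one (EMAIL, IPADDRESS) tuple per match.
prefixed = re.findall(with_prefix, event)
unprefixed = re.findall(without_prefix, event)

print(len(prefixed))    # the heading occurs once, so at most one match is possible
print(len(unprefixed))  # one match per recipient line
```

Because the heading appears only once in the event, a pattern anchored on it cannot match the second and later recipients, no matter how many matches are allowed.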


theouhuios
Motivator

Yes, that would be the reason. Try somesoni2's method; that should work. Or else try this:

rex field=_raw "\s+(?P<Email>\S+@\S+)\s+\((?P<IPADDRESS>\d+\.\d+\.\d+\.\d+)\)"

gsawyer1
Engager

No joy. The regex is still only capturing the first email recipient as the EMAIL field, even though I'm sending to multiple addresses.
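One possible cause is that the last regex was run without `max_match=0`, in which case rex keeps only the first match (its default `max_match` is 1). The sketch below, again using Python's `re` as a stand-in and a hypothetical `event` string, contrasts a single search with iterating over all matches. It also leaves the IP address uncaptured, so only the email addresses are kept.

```python
import re

# Hypothetical event text modelled on the sample in the question.
event = (
    "This action was requested by blah.blah@blah.com.\n"
    "\n"
    "Authorized Recipients:\n"
    "\n"
    "    blah1.blah@blah.com        (000.000.000.000)\n"
    "    blah2.blah@blah.com        (000.000.000.000)\n"
    "    blah3.blah@blah.com        (000.000.000.000)\n"
)

# The email is captured; the parenthesised IPv4 address must follow it
# but is deliberately left outside any capture group.
pattern = re.compile(r"(?P<EMAIL>\S+@\S+)\s+\(\d+\.\d+\.\d+\.\d+\)")

# A single search stops at the first match, like rex without max_match=0.
first_only = pattern.search(event).group("EMAIL")

# Iterating over every match is the analogue of max_match=0.
all_emails = [m.group("EMAIL") for m in pattern.finditer(event)]

print(first_only)
print(all_emails)
```

Under these assumptions, the SPL equivalent would be `rex field=_raw max_match=0 "(?P<EMAIL>\S+@\S+)\s+\(\d+\.\d+\.\d+\.\d+\)"`, which should produce a multivalue EMAIL field containing the recipients only. The requester's address is skipped because no IP address in parentheses follows it.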
