Hi all
For better bounce handling, we're using VERP-style from addresses when sending mail through our Postfix. So when Splunk parses the mail logs, I have values in the from field like this:
from=<bounce+baAABNQIIAAAAAMAAAARZXNEA@newsletter.domain.com>
Now I'm looking for a regex for a search-time extraction that removes the VERP id (everything after the +, up to the @).
I tried to use a lookahead (?=...), which, when it matches, doesn't get added to the overall match:
\<(?<realfrom>[a-zA-Z]+(?=\+{1}[a-zA-Z]+)@.*)\>
But this didn't work so far. Any ideas how to get rid of the VERP id?
Thanks Simon
The only valid option I can think of is to use the rex command with mode=sed to eliminate this part from the email address:
... | rex field=mail mode=sed "s/\+\w+@/@/g"
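To see what that substitution does outside of Splunk, here is a minimal sketch using Python's re module (an assumption for illustration only; Splunk uses PCRE, but these patterns behave the same way in both). The function name strip_verp is hypothetical. It also shows why the lookahead attempt from the question finds nothing: a lookahead consumes no characters, so the @ that follows it has to match right where the + sits.

```python
import re

# Same idea as the sed expression s/\+\w+@/@/g:
# replace "+<id>@" with a plain "@", so the domain is kept.
VERP_TAG = re.compile(r"\+\w+@")

def strip_verp(address: str) -> str:
    """Return the address with the VERP id removed (hypothetical helper)."""
    return VERP_TAG.sub("@", address)

sample = "bounce+baAABNQIIAAAAAMAAAARZXNEA@newsletter.domain.com"
print(strip_verp(sample))  # bounce@newsletter.domain.com

# The pattern from the question: after [a-zA-Z]+ matches "bounce", the
# lookahead succeeds without consuming "+baAAB...", so "@" is then tested
# against "+" and the whole match fails.
original = re.compile(r"<(?P<realfrom>[a-zA-Z]+(?=\+{1}[a-zA-Z]+)@.*)>")
print(original.search("from=<" + sample + ">"))  # None
```

Note that \w only covers letters, digits and underscore. If your VERP ids can contain other characters, [^@]+ in place of \w+ might be the safer choice.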
Thanks, seems like there's no other possibility.
I also got an answer from support that Splunk doesn't replace more than one matching group in transforms.conf.
You should be able to do this:
rex "\<(?<realfrom>\S+)\+\w+@"
I tested this with this search on my system:
* | head 1 | eval blah="from=<bounce+baAABNQIIAAAAAMAAAARZXNEA@newsletter.domain.com>" | rex field=blah "\<(?<realfrom>\S+)\+\w+@" | table blah realfrom
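For reference, a rough check of what that pattern captures, sketched in Python's re module rather than in Splunk (an assumption for illustration; the named-group syntax is (?P<name>...) in Python). The named group ends at the +, so only the local part of the address lands in realfrom.

```python
import re

# Python spelling of the rex pattern above: capture everything between "<"
# and the "+", then require "+<id>@" to follow.
pattern = re.compile(r"<(?P<realfrom>\S+)\+\w+@")

blah = "from=<bounce+baAABNQIIAAAAAMAAAARZXNEA@newsletter.domain.com>"
match = pattern.search(blah)

# The capture stops before the "+", so the domain is not part of it.
print(match.group("realfrom"))  # bounce
```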
Oh ok. In that case ziegfried is right, you'll want to use rex in sed mode.
Hey, thanks for your answer, but it's important that I get the domain name (newsletter.domain.com) in my match too. Only using the username of the email address for identifying senders is not distinct enough.