I am filtering events in transforms.conf, but I cannot get the regex to match. The regex works as expected when I test it in Search, and also when I test it at http://gskinner.com/RegExr/.
I'm trying to match on the MsgType tag.
Sample event:
2013-10-28 4:36:38,322 <?xml version="1.0" encoding="UTF-8"?><INTERFACE><MsgType>SendMessage</MsgType><Emailaddress>user@example.com</Emailaddress><Userid>9999999999999</Userid><FolderName>inbox</FolderName><Alerts>false</Alerts><Ack>true</Ack><To>user@example.com</To></INTERFACE>
Below are the variations I tried. They all seem to work when tested, but not when used in transforms.conf:
^(.*<MsgType>(SendMessage|ReplyMessage)\b<\/MsgType>).*$
^(.*<MsgType.(SendMessage|ReplyMessage)\b<\/).*$
^(.*<MsgType.(SendMessage|ReplyMessage)\b<.MsgType.).*$
^(.*MsgType.(SendMessage|ReplyMessage)\b..MsgType).*$
^(.*<[^<]*MsgType[^>]*>(SendMessage|ReplyMessage)\b<\/[^<\/]*MsgType[^>]*>).*$
This works, but isn't ideal:
^(.*MsgType.(SendMessage|ReplyMessage)\b).*$
What's the proper way to escape the opening/closing tags?
First of all, there's no need to anchor your matches with ^.* and .*$. The regex engine will find what you're after anywhere in the event anyway. You also don't need to escape the characters you're escaping.
<MsgType>(SendMessage|ReplyMessage)</MsgType>
should work just fine.
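As a quick sanity check of the two points above, here is a minimal sketch in Python. Python's `re` module is only a stand-in here, since Splunk uses a PCRE-style engine, but unanchored searching and the optional `\/` escape behave the same way for these patterns. The variable names are made up for illustration.

```python
import re

# Sample event from the question (a single line).
event = (
    '2013-10-28 4:36:38,322 <?xml version="1.0" encoding="UTF-8"?>'
    '<INTERFACE><MsgType>SendMessage</MsgType>'
    '<Emailaddress>user@example.com</Emailaddress>'
    '<Userid>9999999999999</Userid><FolderName>inbox</FolderName>'
    '<Alerts>false</Alerts><Ack>true</Ack><To>user@example.com</To></INTERFACE>'
)

# Short pattern from the answer: no ^.*, no .*$, no escaped slash.
plain = re.compile(r"<MsgType>(SendMessage|ReplyMessage)</MsgType>")

# First variation from the question, with anchors and the escaped slash.
anchored = re.compile(r"^(.*<MsgType>(SendMessage|ReplyMessage)\b<\/MsgType>).*$")

# search() scans the whole string, so the short pattern needs no anchors.
plain_match = plain.search(event)
anchored_match = anchored.search(event)

print(plain_match.group(1))     # message type captured by the short pattern
print(anchored_match.group(2))  # same capture, one group further in
```

Both patterns find the same message type, so the anchors and the escaped slash add nothing except an extra capture group.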
It looks like my issue was due to the fact that SEDCMD-* entries are executed prior to TRANSFORMS-* entries.
As a follow-up, some sed scripts seem to run without issue, while others cause the event to never get indexed. For example, running SEDCMD-format= s/Emailaddress/Email/g after TRANSFORMS-set= setnull,keep in props.conf works, but SEDCMD-format= s/(.*)<MsgType>(.*)<\/MsgType>.*/\1 MsgType=\2/ does not, and the event is never indexed. Any ideas?
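One possible explanation can be sketched in Python. It rests on two assumptions: that the sed script rewrites the event before the transforms see it, as observed above, and that the keep transform's regex is the short pattern suggested in the answer (the actual transforms.conf stanza is not shown in this thread). The helper `would_be_kept` is hypothetical and only simulates the regex test, not Splunk's queue routing.

```python
import re

# Sample event from the question (a single line).
event = (
    '2013-10-28 4:36:38,322 <?xml version="1.0" encoding="UTF-8"?>'
    '<INTERFACE><MsgType>SendMessage</MsgType>'
    '<Emailaddress>user@example.com</Emailaddress>'
    '<Userid>9999999999999</Userid><FolderName>inbox</FolderName>'
    '<Alerts>false</Alerts><Ack>true</Ack><To>user@example.com</To></INTERFACE>'
)

# Assumed regex of the "keep" transform (taken from the answer, not confirmed).
keep = re.compile(r"<MsgType>(SendMessage|ReplyMessage)</MsgType>")


def would_be_kept(raw: str) -> bool:
    """Hypothetical helper: does the keep regex still match this raw text?"""
    return keep.search(raw) is not None


# Equivalent of s/Emailaddress/Email/g: the MsgType tags are left untouched.
renamed = re.sub(r"Emailaddress", "Email", event)

# Equivalent of s/(.*)<MsgType>(.*)<\/MsgType>.*/\1 MsgType=\2/ (first match only).
rewritten = re.sub(
    r"(.*)<MsgType>(.*)</MsgType>.*", r"\1 MsgType=\2", event, count=1
)

print(would_be_kept(renamed))    # keep regex against the renamed event
print(would_be_kept(rewritten))  # keep regex against the rewritten event
print(rewritten)
```

Under those assumptions, the second script removes the `<MsgType>` tags that the keep regex looks for, so only setnull would match and the event would be dropped. If that is what is happening, matching on the rewritten text instead, for example `MsgType=(SendMessage|ReplyMessage)`, may be worth trying.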
Thank you. You are correct and this does work just fine. It seems that a sed script running after the transforms was the issue. I thought it was the regex that was the problem.