I've been battling this, and I'm not sure if it's a bug in Splunk or what. This is for a field extraction.
I simply need to grab all text between the following character strings and assign a field name.
Here is an example event snippet:
Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf
I need to capture everything between Exception= and any one of these terminators:
\n
OR . - GUID
OR :
Like this:
... | rex "(?ms)Exception=(?<MyCapture>.[^\r\n:]+?)(?:[\r\n]|:|\.?\s+-\s+\w{8}-\w{4}-\w{4}-\w{4}-\w{12}|$)"
This is awesome, thanks! I can use this to deconstruct the syntax for other variables. I was working from a lot of documentation on regex, and I swear I was doing things as documented and having crap luck. I really need to sit down and take an in-depth refresher on regex.
This seems close, but the captured field still contains the GUIDs.
Show me non-conforming data and I can adjust.
Exception=BAD_EXTERNAL_DATA - VOYAGER - Los datos indicados por el sistema externo no son los esperados - aa39147e-2cdb-47d8-a167-7175eff6496a
You said "OR . - GUID", but this example does not have a period. I made the period optional and updated my original answer. It should work for both cases now.
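For anyone who wants to sanity-check the updated pattern outside Splunk, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's PCRE for the constructs used here; the one visible difference is that Python spells named groups (?P<name>...) instead of (?<name>...). The extract helper and the two-line sample event are hypothetical; the other events are copied from this thread.

```python
import re

# Same pattern as the rex above, with the named group in Python's spelling.
PATTERN = re.compile(
    r"(?ms)Exception=(?P<MyCapture>.[^\r\n:]+?)"
    r"(?:[\r\n]|:|\.?\s+-\s+\w{8}-\w{4}-\w{4}-\w{4}-\w{12}|$)"
)

def extract(event):
    """Return the MyCapture text, or None when the event has no Exception= field."""
    match = PATTERN.search(event)
    return match.group("MyCapture") if match else None

# Terminator: period, then " - GUID" (event from the question).
PERIOD_GUID = (
    "Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your "
    "Insurance as your coverage has already started, please refer to our Terms "
    "and conditions for cancellation policies. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf"
)
# Terminator: " - GUID" with no period (the non-conforming event).
NO_PERIOD_GUID = (
    "Exception=BAD_EXTERNAL_DATA - VOYAGER - Los datos indicados por el sistema "
    "externo no son los esperados - aa39147e-2cdb-47d8-a167-7175eff6496a"
)
# Terminator: colon.
COLON = (
    "Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your "
    "Insurance as your coverage has already started, please refer to our Terms "
    "and conditions for cancellation policies: additional text for test"
)
# Terminator: newline (hypothetical two-line event).
NEWLINE = "Exception=12567 - INSURANCE_BOOKING - first line\nsecond line"

for event in (PERIOD_GUID, NO_PERIOD_GUID, COLON, NEWLINE):
    print(repr(extract(event)))
```

The lazy quantifier (+?) is what makes this work: the capture grows one character at a time and stops at the first position where any of the terminators matches.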
Try something like this:
your base search | rex field=_raw "Exception=(?<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})"
Run-anywhere sample with all three cases:
| gentimes start=-1 | eval _raw="Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf" | table _raw | append [| gentimes start=-1 | eval _raw="Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies
dfd. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf" | table _raw ]| append [| gentimes start=-1 | eval _raw="Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies: additional text for test" | table _raw]| rex field=_raw "Exception=(?<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})"
How would this look in a field extraction transform? It does not seem to work when declared like this:
(?i)Exception=(?<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})
Not sure you'd need transforms.conf for this. You can just put it in props.conf as an EXTRACT:
[yoursourcetype]
EXTRACT-message = Exception=(?<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})
Or, from Splunk Web, under Settings > Fields > Field extractions.
Unfortunately, this does not stop at any of the terminators. Instead, it assigns all text up to the end of the event to "Message".
Could you try this?
EXTRACT-message = Exception=(?<Message>.+)(:|(\.\s+-\s+\w{8}-\w{4}-\w{4}-\w{4}-\w{12})|[\r\n])
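A likely reason the GUID still ends up in Message is that .+ is greedy. It first consumes the whole line, and if the event continues on another line, the newline alternative matches right there, so the regex never backtracks to the GUID. Reordering the alternatives does not change that; a lazy quantifier (.+?) does. The sketch below illustrates the difference in Python rather than in props.conf. The sample event is a shortened, hypothetical two-line version of the one in the question, and named groups use Python's (?P<name>...) spelling.

```python
import re

GUID = r"\w{8}-\w{4}-\w{4}-\w{4}-\w{12}"

# Greedy capture, as in the EXTRACT suggested earlier in the thread.
GREEDY = re.compile(r"Exception=(?P<Message>.+)(\n|:|\.\s+-\s" + GUID + ")")

# Lazy capture with an optional period (an assumed fix, not tested in Splunk).
LAZY = re.compile(r"Exception=(?P<Message>.+?)(?:[\r\n]|:|\.?\s+-\s+" + GUID + ")")

# Hypothetical two-line event: the GUID ends the first line.
EVENT = (
    "Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel "
    "your Insurance. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf\n"
    "second line of the event"
)

greedy_message = GREEDY.search(EVENT).group("Message")
lazy_message = LAZY.search(EVENT).group("Message")

print("greedy:", greedy_message)  # runs to the end of the first line, GUID included
print("lazy:  ", lazy_message)    # stops before the period and GUID
```

If this holds in Splunk as well, the props.conf equivalent would keep the (?<Message>...) spelling and only change .+ to .+? and \. to \.? in the EXTRACT line. Note that this lazy form has no end-of-event alternative, so an event with none of the terminators would not match at all.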