Splunk Search

Can you help me with some problems with matching rex?

replicamask
Explorer

Hey,

I've been through every post I can find on this, both here and on Google. I imagine I'm making a stupid mistake, but I cannot for the life of me nail it down.

I have a Splunk event payload that contains the field I want to extract twice. Sometimes the two occurrences have the same value and sometimes different ones, but either would match the expression.

The problem I am encountering is that the expression keeps matching text until it reaches the second occurrence of the pattern it is supposed to stop at.

This is the search I'm using:

search terms here
| rex "\SSKU\S(?<field2>[a-zA-Z0-9\D]+)\S\SSKU\S"
| table field2

It's supposed to extract the SKU value from the raw text. However, because the SKU tag appears near the beginning and again towards the end, the capture starts at the first SKU but, instead of ending right after it, keeps going and only stops at the second one.

I'm looking for a way to make it stop after the first match, and if possible, also list the second match in the table.

So instead of what I currently have:

123123123</SKU>   <Quantity>19</Quantity>   <Message>OK</Message>   <MessageID>2</MessageID>   <SKU>123123123

I would just like the column with the value 123123123 to appear, and as mentioned above, if I can also have the second value there as well that would be unreal!

1 Solution

renjith_nair
Legend

@replicamask ,

Make the quantifier non-greedy by adding ? after it.

Try this, and if the SKU always contains only digits, adjust the pattern accordingly:

<SKU>(?<field2>[a-zA-Z0-9\D]+?)<\/SKU>
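
A rough sketch of the "readjust" suggestion, under some assumptions. The class [a-zA-Z0-9\D] matches practically any character, because \D already covers every non-digit and 0-9 covers the digits, so only the closing tag stops the match. If the SKU really is digits only (an assumption, the thread doesn't confirm it), \d+ cannot run past the closing tag even when greedy. The sample event is reconstructed from the output quoted in the question and generated with makeresults, so it is illustrative rather than real data:

```
| makeresults
| eval _raw="<SKU>123123123</SKU>   <Quantity>19</Quantity>   <Message>OK</Message>   <MessageID>2</MessageID>   <SKU>123123123</SKU>"
| rex field=_raw "<SKU>(?<field2>\d+)<\/SKU>"
| table field2
```

If SKUs can contain letters as well, a negated class such as [^<]+ in place of \d+ would be another option, since it also cannot cross a tag boundary.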
---
What goes around comes around. If it helps, hit it with Karma 🙂

replicamask
Explorer

That is perfect! It worked exactly as needed!

Can I ask how adding the ? at the end of the expression inside the parentheses prevents it from running too far like before?

renjith_nair
Legend

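Adding a ? after a quantifier (?, * or +) makes it non-greedy (lazy). Here, +? still matches between one and unlimited times, but as few times as possible, expanding only as needed, so the capture stops at the first </SKU> it can reach instead of the last one.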
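
To see the difference side by side, something like the following could be run on a throwaway event. This is only a sketch: the event text is reconstructed from the output quoted in the question, and the field names greedy and lazy are made up for the comparison.

```
| makeresults
| eval _raw="<SKU>123123123</SKU>   <Quantity>19</Quantity>   <Message>OK</Message>   <MessageID>2</MessageID>   <SKU>123123123</SKU>"
| rex field=_raw "<SKU>(?<greedy>[a-zA-Z0-9\D]+)<\/SKU>"
| rex field=_raw "<SKU>(?<lazy>[a-zA-Z0-9\D]+?)<\/SKU>"
| table greedy lazy
```

On this sample, greedy should come back with everything up to the last </SKU> (the long string from the question), while lazy should contain just 123123123.
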
You could also add max_match=0 if you need every match returned as a multivalue field,
i.e.

|rex field=_raw max_match=0 "<SKU>(?<field2>[a-zA-Z0-9\D]+?)<\/SKU>"

This will display two values in field2
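
If the two values are wanted in separate columns rather than as one multivalue field, mvindex can pull them apart. A minimal sketch, again on a sample event reconstructed from the question; first_sku and second_sku are hypothetical field names:

```
| makeresults
| eval _raw="<SKU>123123123</SKU>   <Quantity>19</Quantity>   <Message>OK</Message>   <MessageID>2</MessageID>   <SKU>123123123</SKU>"
| rex field=_raw max_match=0 "<SKU>(?<field2>[a-zA-Z0-9\D]+?)<\/SKU>"
| eval first_sku=mvindex(field2, 0), second_sku=mvindex(field2, 1)
| table first_sku second_sku
```

mvindex is zero-based, so index 0 is the first match and index 1 the second. If an event has only one SKU, second_sku should simply come back empty.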
