I am stuck extracting the values from the XML below:
<SearchElements>
<entry key="FirstName">%</entry>
<entry key="Gender">MALE</entry>
<entry key="State">VA</entry>
</SearchElements>
I am expecting the regex to output the values as:
%, MALE, VA
Here is the regex, which doesn't work as expected. Please let me know what's going wrong:
rex field=abc "(?ms)\<entry key="\w+"\>(?P<abc>[^<]+)<\entry>"
You don't need to match the whole tag - just match everything up to the start of the next one. Also, be careful of your slashes - in your example you have <\entry> instead of </entry>, and remember to escape the quotes.
| rex max_match=50 field=abc "(?ms)\<entry key=\"\w+\"\>(?<value>[^\<]+)"
| eval valuelist=mvjoin(value, ", ")
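Here's how the regex will be processed:
1. Match the literal text <entry key="
2. Match one or more word characters (\w+), which is the key name
3. Match the closing ">
4. Capture every character up to the next < into the named group value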
That's it - you're done! The only reason to keep matching would be if you either had multiple similar formats, or if you needed to capture more fields.
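To see that stopping at the next < really is enough, here is a minimal sketch that runs both forms of the pattern against the sample XML outside Splunk. It uses Python's re module as a stand-in for Splunk's regex engine, which is an assumption made for illustration; Python spells named groups (?P<value>...), and the variable names are made up for this example.

```python
import re

# Sample event text from the question.
xml = """<SearchElements>
<entry key="FirstName">%</entry>
<entry key="Gender">MALE</entry>
<entry key="State">VA</entry>
</SearchElements>"""

# Short pattern: capture everything up to the next "<".
short = re.compile(r'<entry key="\w+">(?P<value>[^<]+)')

# Full pattern: also match the closing tag (note the forward slash).
full = re.compile(r'<entry key="\w+">(?P<value>[^<]+)</entry>')

# With a single capture group, findall returns the captured strings.
short_values = short.findall(xml)
full_values = full.findall(xml)

print(short_values)
print(full_values == short_values)
```

Both patterns capture the same three values here, so the closing tag adds nothing for this data.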
Here's what the Splunk commands are doing:
rex will repeat the regex processing up to 50 times until all matches are found, putting each match of the named capture group value into a field named value.
eval will then join all of these matches into a single line of text, putting a comma and a space between each match.

Learning Regular Expressions
Get a good regex tester like Kodos or RegexBuddy, and take a good look at regular-expressions.info if you need to practice. That's usually easier than trying to debug regexes in the Splunk command line. Also, try to work out exactly what's going on in each of the other examples people have posted -- getting a handle on the examples is the key to being able to adapt them to your own needs more quickly.
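If you don't have a dedicated tester handy, a few lines of Python can serve as a rough scratchpad. The sketch below only approximates what rex with max_match=50 and mvjoin do; it is not how Splunk implements them, and the helper names extract_values and join_values are hypothetical.

```python
import re
from itertools import islice

PATTERN = re.compile(r'<entry key="\w+">(?P<value>[^<]+)')

def extract_values(text, max_match=50):
    """Collect at most max_match captures, in the spirit of rex max_match=50."""
    matches = islice(PATTERN.finditer(text), max_match)
    return [m.group("value") for m in matches]

def join_values(values, sep=", "):
    """Join the captures into one string, in the spirit of mvjoin."""
    return sep.join(values)

sample = (
    '<entry key="FirstName">%</entry>'
    '<entry key="Gender">MALE</entry>'
    '<entry key="State">VA</entry>'
)

valuelist = join_values(extract_values(sample))
print(valuelist)
```

Lowering max_match to 2 keeps only the first two values, which is a quick way to see what the limit does.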
Thanks for the wonderful explanation. That's really informative.