Search on XML Multiple Key Attribute Pairs

bansi
Path Finder

I am stuck extracting the values from the XML below:

<SearchElements>
    <entry key="FirstName">%</entry>
    <entry key="Gender">MALE</entry>
    <entry key="State">VA</entry>
</SearchElements>

I expect the regex to output the values as:

%, MALE, VA 

Here is the regex, which doesn't work as expected. Please let me know what's going wrong:

rex field=abc "(?ms)\<entry key="\w+"\>(?P<abc>[^<]+)<\entry>"

southeringtonp
Motivator

You don't need to match the whole tag - just match everything up to the start of the next one. Also, be careful of your slashes - in your example you have <\entry> instead of </entry>, and remember to escape the quotes.

| rex max_match=50 field=abc "(?ms)\<entry key=\"\w+\"\>(?<value>[^\<]+)"
| eval valuelist=mvjoin(value, ", ")

Here's how the regex will be processed:

  • Look for the exact leading text <entry key="
  • Move past one or more "word" characters, indicated by \w
  • Move past a quotation mark and a close-bracket
  • Fill the named capture group "value" with one or more characters that are not an open-bracket symbol
  • That's it - you're done! The only reason to keep matching would be if you either had multiple similar formats, or if you needed to capture more fields.
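The steps above can be tried outside Splunk. Here is a minimal sketch in Python, assuming the sample XML from the question. Note that Python's re module is not the same engine Splunk uses, and it wants named groups written as (?P<name>...), so treat this as an illustration of the matching logic rather than a drop-in copy of the rex expression.

```python
import re

# Sample event text, copied from the question.
xml = """<SearchElements>
    <entry key="FirstName">%</entry>
    <entry key="Gender">MALE</entry>
    <entry key="State">VA</entry>
</SearchElements>"""

# Same idea as the rex pattern, in Python syntax:
#   <entry key="   literal leading text
#   \w+            one or more "word" characters (the key name)
#   ">             closing quote and close-bracket
#   [^<]+          captured: everything up to the next open-bracket
# The (?ms) flags are left out because the pattern uses neither ^/$ nor '.'.
pattern = re.compile(r'<entry key="\w+">(?P<value>[^<]+)')

values = [m.group("value") for m in pattern.finditer(xml)]
print(values)  # ['%', 'MALE', 'VA']
```

The pattern never looks at the closing </entry> tag, which is why the slash mistake in the original attempt stops mattering.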

Here's what the Splunk commands are doing:

  • rex applies the regex repeatedly, up to 50 times (max_match=50), until no more matches are found, and puts each match of the named capture group value into a multivalue field named value.
  • eval will then join all of these matches into a single line of text, putting a comma and a space between each match.
Learning Regular Expressions

Get a good regex tester like Kodos or RegexBuddy, and take a good look at regular-expressions.info if you need to practice. That's usually easier than trying to debug regexes on the Splunk command line. Also, try to work out exactly what's going on in each of the other examples people have posted. Getting a handle on the examples is the key to being able to adapt them to your own needs more quickly.
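If no dedicated tester is at hand, a few lines of Python can stand in for one. The sketch below mimics the two Splunk commands: a cap on the number of matches (like max_match) and a comma-and-space join (like mvjoin). The function name extract_values and the sample event string are made up for illustration, and Python's regex dialect differs from Splunk's in places, so re-check the final pattern in Splunk itself.

```python
import re
from itertools import islice

# Python-syntax version of the pattern; named groups use (?P<name>...).
PATTERN = re.compile(r'<entry key="\w+">(?P<value>[^<]+)')

def extract_values(text, max_match=50):
    """Return up to max_match captures of the 'value' group, in order."""
    matches = islice(PATTERN.finditer(text), max_match)
    return [m.group("value") for m in matches]

# Hypothetical single-line event holding the same three entries.
event = (
    '<entry key="FirstName">%</entry>'
    '<entry key="Gender">MALE</entry>'
    '<entry key="State">VA</entry>'
)

# Join the matches the way mvjoin(value, ", ") would.
valuelist = ", ".join(extract_values(event))
print(valuelist)  # %, MALE, VA

# A lower cap stops early, as a smaller max_match would.
print(extract_values(event, max_match=2))  # ['%', 'MALE']
```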

    bansi
    Path Finder

    Thanks for the wonderful explanation. That's really informative.
