Splunk Search

Search on XML Multiple Key Attribute Pairs

bansi
Path Finder

I am stuck extracting the "values" from the XML below:

<SearchElements>
    <entry key="FirstName">%</entry>
    <entry key="Gender">MALE</entry>
    <entry key="State">VA</entry>
</SearchElements>

I expect the regex to output the values as:

%, MALE, VA 

Here is the regex, which doesn't work as expected. Please let me know what's going wrong:

rex field=abc "(?ms)\<entry key="\w+"\>(?P<abc>[^<]+)<\entry>"

southeringtonp
Motivator

You don't need to match the whole tag - just match everything up to the start of the next one. Also, be careful of your slashes - in your example you have <\entry> instead of </entry>, and remember to escape the quotes.

| rex max_match=50 field=abc "(?ms)\<entry key=\"\w+\"\>(?<value>[^\<]+)"
| eval valuelist=mvjoin(value, ", ")

Here's how the regex will be processed:

  • Look for the exact leading text <entry key="
  • Move past one or more "word" characters, indicated by \w+
  • Move past a quotation mark and a close-bracket
  • Fill the named capture group "value" with one or more characters that are not an open-bracket symbol
  • That's it - you're done! The only reason to keep matching would be if you either had multiple similar formats, or if you needed to capture more fields.
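
The steps above can be checked outside Splunk with a short Python sketch. This is only an approximation: Python's re module is close to, but not the same as, the PCRE engine Splunk uses, and it spells the named group (?P<value>...). The names sample and pattern are made up for illustration, and the sample text is assumed to be the content of the field abc.

```python
import re

# Sample text from the question (assumed to be what the field "abc" holds).
sample = """<SearchElements>
    <entry key="FirstName">%</entry>
    <entry key="Gender">MALE</entry>
    <entry key="State">VA</entry>
</SearchElements>"""

# Same logic as the rex pattern: the literal text '<entry key="', one or more
# word characters, '">', then capture everything up to the next '<'.
pattern = re.compile(r'<entry key="\w+">(?P<value>[^<]+)')

values = [m.group("value") for m in pattern.finditer(sample)]
print(values)  # ['%', 'MALE', 'VA']
```

Note that the pattern never mentions </entry>; the capture simply stops at the first "<" it meets.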

Here's what the Splunk commands are doing:

  • rex will repeat the regex processing up to 50 times (max_match=50), or until no more matches are found, and put each match of the named capture group value into a multivalue field named value.
  • eval will then join all of these matches into a single line of text, putting a comma and a space between each match.
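
To make the two commands concrete, here is a rough Python imitation. It is a sketch, not how Splunk implements them: rex_like is a hypothetical helper standing in for rex with max_match, and str.join stands in for mvjoin. The event string is assumed to be the content of the field abc.

```python
import re
from itertools import islice

event = ('<entry key="FirstName">%</entry>'
         '<entry key="Gender">MALE</entry>'
         '<entry key="State">VA</entry>')
pattern = re.compile(r'<entry key="\w+">(?P<value>[^<]+)')

def rex_like(text, max_match=50):
    """Collect at most max_match captures, loosely mimicking rex max_match=N."""
    return [m.group("value") for m in islice(pattern.finditer(text), max_match)]

value = rex_like(event)        # plays the role of the multivalue field "value"
valuelist = ", ".join(value)   # plays the role of mvjoin(value, ", ")
print(valuelist)               # %, MALE, VA

# With a limit of 1 only the first match is kept, which is why max_match
# has to be raised to get every value.
print(rex_like(event, max_match=1))  # ['%']
```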

Learning Regular Expressions

Get a good regex tester like Kodos or RegexBuddy, and take a good look at regular-expressions.info if you need to practice. That's usually easier than trying to debug regexes in the Splunk command line. Also, try to work out exactly what's going on in each of the other examples people have posted. Getting a handle on the examples is the key to being able to adapt them to your own needs more quickly.
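
If no dedicated tester is at hand, a few lines of Python can serve the same purpose. The sketch below, which assumes Python 3.7 or later, checks the closing tag as typed in the question against the corrected one. Python rejects \e as an invalid escape, while PCRE (the flavor Splunk uses) reads \e as the escape control character. In both cases <\entry> does not match the literal text </entry>. The variable names are illustrative only.

```python
import re

line = '<entry key="Gender">MALE</entry>'

# Closing tag as typed in the question: backslash instead of forward slash.
try:
    re.compile(r'<\entry>')
    broken_ok = True
except re.error as exc:
    broken_ok = False
    print("rejected:", exc)

# Corrected closing tag: the full element now matches.
fixed = re.search(r'<entry key="\w+">(?P<value>[^<]+)</entry>', line)
print(fixed.group("value"))  # MALE
```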

bansi
Path Finder

Thanks for the wonderful explanation. That's really informative.
