Splunk Search

How to write the regex to extract a field from XML data if the field is not completely XML?

jameskerivan
Explorer

Hi

I have a field, and I would like to extract a value from the XML it contains. The only problem is that the content is not complete XML. I am not allowed to post a real example, but basically I want to extract from something that looks like:

Event xml

<?xml version="1.0" encoding="UTF-8" standalone="yes"><ns2:behaviorVersion>0</ns2:behaviorVersion><triggers><channelId>0055</channelId><clientVersion>3</clientVersion></triggers><eventInfo><bos:instanceId>000121481</bos:instanceId><bos:serverName>1</bos:serverName><bos:implementationName>TransferStarted</bos:implementationName>

And I would like to grab TransferStarted in between the two tags <bos:implementationName> and </bos:implementationName>.

I have worked with regex in the past, but am still not confident. Any help would be much appreciated and Happy New Year!

0 Karma
1 Solution

sundareshr
Legend

Have you tried this?

implementationName\>(\w+)\<

jameskerivan
Explorer

Yes this is what I want. Right now I am doing

base query | rex field=F "(?<preName>.*)implementationName\>(\w+)\<" | stats count by preName | sort count desc

But this gives me everything before implementationName, as I specified. How would I extract the value itself? The way I see the regex working, it matches implementationName and then looks for the characters > and < that open and close the value I want. Do I need to specify a variable for that value?

0 Karma

sundareshr
Legend

Try this, assuming preName is the name you want for that field.

"implementationName>(?<preName>w+)<"
0 Karma

sundareshr
Legend

There should be a backslash before "w+", i.e. "implementationName>(?<preName>\w+)<"

0 Karma
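
For anyone who wants to try the corrected pattern without the original data, here is a minimal sketch. It assumes a throwaway event built with makeresults, and the value of F is only a shortened copy of the sample in the question. rex creates a field only for a named capture group, which is why the earlier unnamed (\w+) group matched but produced no field.

```
| makeresults
| eval F="<eventInfo><bos:serverName>1</bos:serverName><bos:implementationName>TransferStarted</bos:implementationName>"
| rex field=F "implementationName>(?<preName>\w+)<"
| table preName
```

With this sample, preName should come back as TransferStarted. The field name F and the name preName follow the thread; substitute whatever your events actually use.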

jameskerivan
Explorer

So the stats that it gives me is very confusing. Here is my query :

base query | rex field=F "implementationName>(?<preName>\w+)<" | stats count by preName | sort count desc

This gives me only a small number of the implementationName values, not all of them. For example, TransferStarted was not counted in my stats, but I can see it when I look at the events. Am I missing something?

0 Karma

sundareshr
Legend

If there is more than one occurrence of the preName in a single event, you should add max_match=0 to the rex command and use multivalue functions to get the right result.

0 Karma
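
A rough sketch of that suggestion, again on a made-up makeresults event. The second value, TransferEnded, is hypothetical and only there to give the event two matches. max_match=0 tells rex to keep every match as a multivalue field, and mvexpand is one way to split those values into separate rows before counting.

```
| makeresults
| eval F="<bos:implementationName>TransferStarted</bos:implementationName><bos:implementationName>TransferEnded</bos:implementationName>"
| rex field=F max_match=0 "implementationName>(?<preName>\w+)<"
| mvexpand preName
| stats count by preName
| sort - count
```

Without max_match=0, rex keeps only the first match per event, so later values in the same event never reach stats. Separately, \w+ only matches letters, digits, and underscores, so if some values contain hyphens, dots, or spaces, a pattern such as (?<preName>[^<]+) may be worth trying instead.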

jameskerivan
Explorer

Thank you very much. You have been so helpful. The problem I am coming across is with the way we are logging. Your query is correct!

0 Karma