XML Field extraction

Sheela
Path Finder

I'm trying to extract XML fields from a report which is about 70-80 lines (maybe more). I receive the whole report as a single event because breaking it would make the report lose its meaning. I have been researching and trying out various means of field extraction for this report but nothing has worked out so far. If anyone can help me out with this, it'd be great.
I tried xmlkv, spath, xpath, and manual regex field extraction. When I try manual field extraction or xmlkv, it matches only the last occurrence of the tag. For example, consider the following code sample:





192.168.X.X
netsaint (5666/tcp)
High


192.168.X.X
ssh (22/tcp)
Low



When I use regex for field extraction, or when I use xmlkv for, say, the field level, I get only the last value (Low). Also, spath by default extracts fields from only the first 5000 characters. I understand this can be changed in limits.conf, but I don't know how many characters my report will contain, so I don't know what value to set. When I try spath like so:
whatever_search|spath output=host path=objects.object.ip|top host
the field host contains the whole XML report and not just the field I'm looking for. Can someone please suggest an alternative or a solution to this? I have no option but to use XML for this.


tskinnerivsec
Contributor

Kristian, I just wanted to say thanks for the tip. I've been able to successfully use this method to do field extractions in some XML logs I'm working with.


Sheela
Path Finder

Did you find a solution?


kristian_kolb
Ultra Champion

Have you looked at MV_ADD=true in order to get more than the last value?

Basically, you need to make the following changes/additions:

in props.conf

[your_xml_sourcetype]
REPORT-gettin_da_levels = da_level

in transforms.conf

[da_level]
REGEX = <level>([^<]+)<
FORMAT = myLevel::$1
MV_ADD = true
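
To see what the multivalue setting changes, here is a minimal Python sketch that runs outside Splunk and only mimics the idea: a single-valued extraction keeps one match of the regex, while a multivalue extraction keeps every match. The sample event and the function names are hypothetical; the `<objects>`, `<object>`, `<ip>` and `<level>` tags are assumed from the spath path and the REGEX used in this thread.

```python
import re

# Hypothetical event text. Tag names are assumptions inferred from the
# spath path (objects.object.ip) and the <level> regex in this thread.
event = (
    "<objects>"
    "<object><ip>192.168.X.X</ip><level>High</level></object>"
    "<object><ip>192.168.X.X</ip><level>Low</level></object>"
    "</objects>"
)

# Same pattern as the REGEX line in transforms.conf.
LEVEL = re.compile(r"<level>([^<]+)<")


def extract_one_level(text):
    """Keep a single match only (here: the first one Python finds)."""
    match = LEVEL.search(text)
    return match.group(1) if match else None


def extract_all_levels(text):
    """Keep every match, which is the effect MV_ADD is meant to give."""
    return LEVEL.findall(text)


if __name__ == "__main__":
    print(extract_one_level(event))
    print(extract_all_levels(event))
```

This is only an analogy for the behaviour; in Splunk the extraction itself is still done by the props.conf and transforms.conf stanzas above.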

Hope this helps,

Kristian

kristian_kolb
Ultra Champion

You're most welcome 🙂


Sheela
Path Finder

Kristian: Thank you very much for your help. Yours is the first solution that worked for me. Really appreciate all the help. Thank you!


kristian_kolb
Ultra Champion

Sheela: Glad it worked so far. As for context, I don't know. I'm not all that familiar with working with XML files. I guess you have tried the xpath and spath commands, which supposedly do this kind of thing. Sorry...

tb5821: I know that extract(kv) has an mv_add option which can be used inline; however, I don't think it'll work here.


tb5821
Communicator

Do you have to make changes to config files, or is there a way to do it via search only?


Sheela
Path Finder

Thanks Kristian! That worked like a charm. Can you also tell me how I can do the same thing while maintaining the level of nesting in the XML? The reports I have are about 200 lines long and deeply nested. Do you have any suggestions on how I can extract fields so that they make sense in their context?
For example, in the XML above, one host (192.168.X.Y) can have level High while another host (192.168.X.X) can have level Low. Will I be able to extract such context-sensitive information?
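
To make the kind of result being asked for concrete, here is a rough Python sketch that runs outside Splunk and keeps each level attached to the host it belongs to by walking the XML tree instead of matching tags independently. It is an illustration under assumptions, not a Splunk solution: the `<objects>`, `<object>`, `<ip>` and `<level>` names are inferred from the spath path and regex in this thread, while the `<port>` tag name and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical report. <port> is an invented tag name; the others are
# assumed from the spath path and the <level> regex used in this thread.
report = """<objects>
  <object>
    <ip>192.168.X.Y</ip>
    <port>netsaint (5666/tcp)</port>
    <level>High</level>
  </object>
  <object>
    <ip>192.168.X.X</ip>
    <port>ssh (22/tcp)</port>
    <level>Low</level>
  </object>
</objects>"""


def pair_hosts_with_levels(xml_text):
    """Return (ip, level) pairs, one per <object>, at any nesting depth."""
    root = ET.fromstring(xml_text)
    pairs = []
    # iter() walks all descendants, so deeper nesting is still covered.
    for obj in root.iter("object"):
        # findtext() reads the direct child, keeping ip and level together.
        pairs.append((obj.findtext("ip"), obj.findtext("level")))
    return pairs


if __name__ == "__main__":
    for ip, level in pair_hosts_with_levels(report):
        print(ip, level)
```

The point of the sketch is that the pairing comes from grouping by the enclosing `<object>` element, which is the context that two independent multivalue fields would lose.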


tb5821
Communicator

I think I'm having a similar issue over here: http://splunk-base.splunk.com/answers/45039/regex-text

I get the entire line, not just the data between the two fields.
