How to extract mixed data content from an event (XML and REGEX)?

llacoste
Path Finder

Hi all,

I am facing a problem I cannot seem to solve:

I've got events of this kind:

[2017-07-21 11:06:44,007] INFO text text text text more text and text and more text: [<?xml version="1.0" encoding="UTF-8"?> ... XML CONTENT]

I am able to extract the interesting fields in the first part, but then I want to also be able to extract the XML from the second part...

I know we can extract the XML part into a field and use it with spath, but then all of that has to happen in the search bar. Is there anything I could do in props.conf so that everything gets extracted automatically and is available when searching the data in the interface?

I tried to define:

EXTRACT-example = <REGEX HERE>
KV_MODE=XML

I thought it might extract what matches the regex and then use KV_MODE to extract the second part as XML, but... no luck.

Any idea on how I could achieve this please?

Thanks guys!

Laurent

woodcock
Esteemed Legend

Have the "xml part" extracted into a field and then use the spath and/or xmlkv commands on that field in your SPL.

llacoste
Path Finder

Hi,
Thanks for the advice.
You're right, that's what I was going to do, and it's what others have already suggested in this thread. However, as explained, I was trying to do everything without using spath or xmlkv in the SPL, if that makes sense. I don't want to overload my searches or dashboards. I want users to have those fields extracted for them whenever they write SPL.

So I was hoping Splunk could be really clever and perform both extractions at the same time in props.conf 🙂

What do you think?

cpetterborg
SplunkTrust

Why do you need the fields extracted at index time? Splunk best practice is to do that at search time. You can easily create a search-time field extraction, which is more flexible than an index-time one, either with the field extraction tool or through the Settings -> Fields -> Field extractions menu. Either one works great. If you are good with regular expressions, you should have no problem. If you aren't, the field extraction tool may be your best bet.

llacoste
Path Finder

Well, in fact I am not going for index-time extraction but search-time extraction. The props.conf is in the Search Head app.
I know it is best practice to avoid index-time extraction where possible.

Using the GUI field extraction feature would not achieve what I am trying to do here. I could extract everything manually with regexes, but I was wondering whether we could let Splunk do the hard work by using some kind of spath or kv_mode=xml inside props.conf...
Hope that clarifies my need, thanks for your answer 🙂

gcusello
SplunkTrust

Hi llacoste,
try something like this:

.*\[(?<xml_field>\<\?xml version[^\]]*)

Bye.
Giuseppe
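
To make the suggestion above concrete, here is a minimal props.conf sketch. It is an illustration under stated assumptions, not a solution confirmed in this thread. The sourcetype name my_app_log is hypothetical, and so are the XML paths order.id and order.status, since the actual XML content is not shown in the question. The EXTRACT line reuses the regex above as-is. The EVAL lines are calculated fields that call the spath eval function on the extracted field; calculated fields are evaluated after inline extractions at search time, so xml_field should already exist when they run. The catch is that every sub-field has to be listed by hand, so this is not a fully automatic equivalent of running spath in the search bar.

```
# props.conf in the search head app (search-time settings)
# [my_app_log] is a hypothetical sourcetype name; replace it with your own.
[my_app_log]
# Inline extraction: capture the bracketed XML payload into xml_field.
EXTRACT-xml_field = .*\[(?<xml_field>\<\?xml version[^\]]*)
# Calculated fields: pull individual values out of xml_field with spath().
# order.id and order.status are assumed paths, for illustration only.
EVAL-order_id = spath(xml_field, "order.id")
EVAL-order_status = spath(xml_field, "order.status")
```

If the paths match the real XML, a search such as sourcetype=my_app_log order_status=* should see these fields without any spath in the SPL.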

llacoste
Path Finder

Hi, thanks for your answer,

However, that would only extract the XML part as a whole, without breaking it down into its individual sub-fields.

I have already extracted the full XML part into a field, but after that I need to use spath from the search bar. That's not what I am looking for: I want Splunk to extract each sub-field from the XML on its own.

Hope it makes sense...
