Dashboards & Visualizations

Parse XML file with repeating events and indexed as one event

i2sheri
Communicator

I am indexing an XML file which has the structure below with many other tags in, before, and after <a> and <b>

I need -> | eval diffInSec= end-start | table status start end diffInSec for each c below and for all events.

I'm indexing the whole file as a single event, but I want each status as one record from all events. Is it possible?

<?xml?>
<a>
    <b><c><c>

        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>
        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>
        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>
        <c>
            <status status="P" start="DT" end="DT"></status>
        </c>

    </c></c></b>
</a>
0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

If you've KV_MODE=xml in props.conf (on search head), then Splunk should automatically extract fields (mostly multivalued) from your xml. Search for your data in Verbose mode and you should see fields (on left field sidebar) may be like this

a.b.c.status{@status}
a.b.c.status(@start}
a.b.c.status(@end}

Once you get these fields, use mvzip/mvexpand/rex to get these as separate row and get your table. May be like this (sample, adjust based on your fields)

your base search | table a.b.c.status* | rename a.b.c.status{@*} as * | eval temp=mvzip(status,mvzip(start,end,"#"),"#") | table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<start>.*)#(?<end>.*)#" | field - temp 

Above should give you a table with status, start and end. Then based on your timestamp format, your can calculate the different.

View solution in original post

somesoni2
SplunkTrust
SplunkTrust

If you've KV_MODE=xml in props.conf (on search head), then Splunk should automatically extract fields (mostly multivalued) from your xml. Search for your data in Verbose mode and you should see fields (on left field sidebar) may be like this

a.b.c.status{@status}
a.b.c.status(@start}
a.b.c.status(@end}

Once you get these fields, use mvzip/mvexpand/rex to get these as separate row and get your table. May be like this (sample, adjust based on your fields)

your base search | table a.b.c.status* | rename a.b.c.status{@*} as * | eval temp=mvzip(status,mvzip(start,end,"#"),"#") | table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<start>.*)#(?<end>.*)#" | field - temp 

Above should give you a table with status, start and end. Then based on your timestamp format, your can calculate the different.

i2sheri
Communicator

which is better of the two below

host=h index=i sourcetype=xml
| eval start=strptime('a.b.c.status{@starttime}',"%Y%m%d %H:%M:%S.%3Q")
| eval end=strptime('a.b.c.status{@endtime}',"%Y%m%d %H:%M:%S.%3Q")
| eval temp=mvzip('a.b.c.status{@status}', mvzip(end, start, "#"), "#") 
| table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<end>.*)#(?<start>.*)" 
| eval diffInSec = end-start
| fields status diffInSec end start

OR

host=h index=i sourcetype=xml
| table a.b.c.status* | rename a.b.c.status{@*} as *
| eval temp=mvzip(status,mvzip(starttime,endtime,"#"),"#") 
| table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<starttime>.*)#(?<endtime>.*)"
| eval start=strptime(starttime,"%Y%m%d %H:%M:%S.%3Q")
| eval end=strptime(endtime,"%Y%m%d %H:%M:%S.%3Q")
| eval diffInSec=end-start | fields status diffInSec end start
0 Karma

i2sheri
Communicator

rex does not work. pasted one of the temp entries below.

PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS PASS#20150921 06:00:15.379 20150921 06:00:15.395 20150921 06:00:15.408 20150921 06:00:15.415 20150921 06:00:15.433 20150921 06:00:15.445 20150921 06:00:15.452 20150921 06:00:15.457 20150921 06:00:15.485 20150921 06:00:15.494 20150921 06:00:15.499 20150921 06:00:15.513 20150921 06:00:15.523 20150921 06:00:15.556 20150921 06:00:15.570#20150921 06:00:15.385 20150921 06:00:15.405 20150921 06:00:15.413 20150921 06:00:15.419 20150921 06:00:15.437 20150921 06:00:15.450 20150921 06:00:15.455 20150921 06:00:15.473 20150921 06:00:15.492 20150921 06:00:15.497 20150921 06:00:15.509 20150921 06:00:15.520 20150921 06:00:15.527 20150921 06:00:15.568 20150921 06:00:15.572 
0 Karma

i2sheri
Communicator

I removed |xmlkv from base search and it works with two changes

 your base search | table a.b.c.status* | rename a.b.c.status{@*} as * | eval temp=mvzip(status,mvzip(start,end,"#"),"#") | table temp | mvexpand temp | rex field=temp "(?<status>.*)#(?<start>.*)#(?<end>.*)" | fields - temp 
0 Karma

richgalloway
SplunkTrust
SplunkTrust

It's probably not the best way, but you could do it with rex. That will give you three multi-valued fields. Combine them with mvzip then split the result into separate events and should have what you want.

... | rex max_match=0 "status=\"(?P<status>[^\"]*)\" start=\"(?P<start>[^\"]*)\" end=\"(?P<end>[^\"]*)\"" | eval fields = mvzip(mvzip(status,start),end) | mvexpand fields | ...
---
If this reply helps you, Karma would be appreciated.
0 Karma
Get Updates on the Splunk Community!

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...

Tech Talk | Elevating Digital Service Excellence: The Synergy of Splunk RUM & APM

Elevating Digital Service Excellence: The Synergy of Real User Monitoring and Application Performance ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...