How can we index an entire XML document as one event?

ddrillic
Ultra Champion

We have data as XML documents. How can we index each XML document as one Splunk event?

A sample document:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<mlcpMetricsModel xmlns="http://xxxxxxx">
    <canonical>xxxxxx</canonical>
    <duration>PT41M3.401S</duration>
    <env>PROD</env>
    <ecmProcDateTime>20170727_M</ecmProcDateTime>
    <outputRecords>4948262</outputRecords>
    <outputRecordsCommitted>4948262</outputRecordsCommitted>
    <outputRecordsFailed>0</outputRecordsFailed>
    <reportDate>2017-08-02T10:49:31</reportDate>
    <source>CDB</source>
    <startTime>2017-08-02T10:08:18.512</startTime>

</mlcpMetricsModel>

somesoni2
Revered Legend

Try this in props.conf:

[yoursourcetype]
LINE_BREAKER = ([\r\n]+)(?=\<mlcpMetricsModel )
SHOULD_LINEMERGE = false
TIME_PREFIX = \<reportDate\>
TIME_FORMAT = %Y-%m-%dT%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
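
For anyone who wants to see where those breaks land before touching props.conf, here is a minimal Python sketch that imitates the LINE_BREAKER behavior on two concatenated copies of a shortened sample. It only approximates Splunk's event breaking and is not Splunk itself; the split_events helper and the DOC string are made up for illustration.

```python
import re

# Same pattern as the LINE_BREAKER setting above. Splunk ends one event
# where capture group 1 starts and begins the next event where it ends;
# the text matched by the group itself is discarded.
LINE_BREAKER = re.compile(r"([\r\n]+)(?=\<mlcpMetricsModel )")

def split_events(raw):
    """Approximate Splunk's event breaking (illustration only)."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    return events

# Shortened version of the sample document from the question (hypothetical test data).
DOC = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    '<mlcpMetricsModel xmlns="http://xxxxxxx">\n'
    '    <env>PROD</env>\n'
    '    <reportDate>2017-08-02T10:49:31</reportDate>\n'
    '</mlcpMetricsModel>\n'
)

events = split_events(DOC + DOC)
for i, event in enumerate(events):
    print(i, repr(event[:40]))
```

In this sketch the break falls right before each <mlcpMetricsModel tag, so the <?xml ...?> declaration of a document ends up at the tail of the preceding event (and on its own for the first document in the file).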

ddrillic
Ultra Champion

Now that I'm preparing for the admin certification, I wonder why @somesoni2 set MAX_TIMESTAMP_LOOKAHEAD = 19. It obviously works, but based on the data it appears that the value should be much higher.

somesoni2
Revered Legend

The timestamp is extracted from the <reportDate> tag, whose value, 2017-08-02T10:49:31, is 19 characters long.

ddrillic
Ultra Champion

Got it ;-) So, if TIME_PREFIX is set, the lookahead starts from there; otherwise, it starts from the beginning of the event.

somesoni2
Revered Legend

(thumbs up)

Just in case it's still confusing for anyone: 19 is the length of the timestamp string described by the TIME_FORMAT string. (%Y-%m-%dT%H:%M:%S => %Y(4) -(1) %m(2) -(1) %d(2) T(1) %H(2) :(1) %M(2) :(1) %S(2) => 4+1+2+1+2+1+2+1+2+1+2 = 19)
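
The same arithmetic can be checked with a small Python sketch: find the TIME_PREFIX match, take the next MAX_TIMESTAMP_LOOKAHEAD characters, and parse them with the TIME_FORMAT string. This is only a rough model of what Splunk does with these three settings; extract_timestamp and EVENT are hypothetical names, and Python's strptime happens to accept the same format codes used here.

```python
import re
from datetime import datetime

# Values copied from the props.conf stanza in the accepted answer.
TIME_PREFIX = r"\<reportDate\>"
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"
MAX_TIMESTAMP_LOOKAHEAD = 19

# Shortened event based on the sample in the question (hypothetical test data).
EVENT = (
    '<mlcpMetricsModel xmlns="http://xxxxxxx">\n'
    '    <env>PROD</env>\n'
    '    <reportDate>2017-08-02T10:49:31</reportDate>\n'
    '    <startTime>2017-08-02T10:08:18.512</startTime>\n'
    '</mlcpMetricsModel>'
)

def extract_timestamp(event):
    """Rough model: the lookahead window starts right after the prefix match."""
    match = re.search(TIME_PREFIX, event)
    start = match.end() if match else 0  # no prefix: start of the event
    window = event[start:start + MAX_TIMESTAMP_LOOKAHEAD]
    return window, datetime.strptime(window, TIME_FORMAT)

window, parsed = extract_timestamp(EVENT)
print(len(window), parsed.isoformat())  # prints: 19 2017-08-02T10:49:31
```

The 19-character window covers exactly 2017-08-02T10:49:31 and stops before </reportDate>, which is why the small value is enough even though the event itself is much longer.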

niketn
Legend

@somesoni2, I think \<startTime\> is a better candidate for the timestamp. However, @ddrillic must confirm.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

ddrillic
Ultra Champion

Thank you both - let me check...

ddrillic
Ultra Champion

Gorgeous, thank you!! We ended up extracting the XML fields using a series of spath commands such as: spath mlcpMetricsModel.env ...
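
As a rough offline analogue of that field extraction (not spath itself), the Python sketch below parses a shortened copy of the sample and builds dotted field names in the same mlcpMetricsModel.env style. The flatten and local_name helpers are made up for illustration, and dropping the XML namespace from the names is an assumption chosen to match the path shown above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document from the question (hypothetical test data).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<mlcpMetricsModel xmlns="http://xxxxxxx">
    <env>PROD</env>
    <outputRecords>4948262</outputRecords>
    <outputRecordsFailed>0</outputRecordsFailed>
    <source>CDB</source>
</mlcpMetricsModel>"""

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def flatten(xml_bytes):
    """Map each direct child element to a 'root.child' style field name."""
    root = ET.fromstring(xml_bytes)
    prefix = local_name(root.tag)
    return {
        f"{prefix}.{local_name(child.tag)}": (child.text or "").strip()
        for child in root
    }

fields = flatten(SAMPLE)
for name, value in fields.items():
    print(name, "=", value)
```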
