Splunk Search

How can I extract specific data from a complex block of XML?

jtracy
Engager

So I've been reading around and most people point to xpath, but after hours of troubleshooting I can't seem to get it to work in my scenario. I'm trying to extract decision and reasoncode as fields and report on them, but I just can't seem to get Splunk to recognize them in all of the outputs.

So far I have this search working to report all failures (kind of). But I'm hoping for something better.

Sample Search:

host=relevanthost "<c:decision>" "<c:reasonCode>" NOT "<c:decision>ACCEPT</c:decision>"  
|xmlkv |timechart count by c:reasonCode limit=25

Sample Code:

[2016-05-timestampstuff] [HOSTNAME] [NOTIFICATION] [numbers] [oracle.router.stuff] [tid: [ACTIVE].ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)'] [userId: <fakeuser>] [ecid: fakenumbersandlettersecid] [oracle.soa.tracking.FlowId: 000000000] [oracle.soa.tracking.InstanceId: 000000000] [oracle.soa.tracking.SCAEntityId: 00000] [FlowId:fakenumbersandletters] [[
 [OSB Tracing] Entering pipeline pair Process Request with message context: 
 [MessageContextImpl  uid="fakeuidlettersandnumbers"
 body="<soap:Body xmlns:soap="http://schemas.fake.url/soap/stuff">
  <c:replyMessage xmlns:c="urn:fake.data.here">
    <c:merchantReferenceCode>0000000000</c:merchantReferenceCode>
    <c:requestID>0000000000000000000000</c:requestID>
    <c:decision>ACCEPT</c:decision>
    <c:reasonCode>100</c:reasonCode>
    <c:requestToken>faketokenlettersandnumbers</c:requestToken>
    <c:purchaseTotals>
      <c:currency>USD</c:currency>
    </c:purchaseTotals>
    <c:ccAuthReply>
      <c:reasonCode>100</c:reasonCode>
      <c:amount>99.99</c:amount>
      <c:authorizationCode>000000</c:authorizationCode>
      <c:avsCode>X</c:avsCode>
      <c:avsCodeRaw>A3</c:avsCodeRaw>
      <c:cvCode>S</c:cvCode>
      <c:cvCodeRaw>S</c:cvCodeRaw>
0 Karma
1 Solution

sundareshr
Legend

If you're only interested in the two fields, you're probably better off using rex. Like this

.... | rex ield=_raw "decision\>(?<decision>[^\<]+)\<" | rex field=_raw "reasonCode\>(?<reasonCode>[^\<]+)\<\/" | where decision!="ACCEPT" | timechart count by reasonCode

View solution in original post

sundareshr
Legend

If you're only interested in the two fields, you're probably better off using rex. Like this

.... | rex ield=_raw "decision\>(?<decision>[^\<]+)\<" | rex field=_raw "reasonCode\>(?<reasonCode>[^\<]+)\<\/" | where decision!="ACCEPT" | timechart count by reasonCode

jtracy
Engager

This worked! Thanks! Would I be able to somehow use this to extrapolate what percentage each value for reasonCode represents in all results with "ACCEPT" as a value for decision in a timechart?

0 Karma

sundareshr
Legend

Try this

.... | rex ield=_raw "decision\>(?<decision>[^\<]+)\<" | rex field=_raw "reasonCode\>(?<reasonCode>[^\<]+)\<\/" | where decision!="ACCEPT" | eventstats count as total by reasonCode | eval p=1/total | bin span=1d _time as time | stats sum(p) as perc by time reasonCode | eval perc=tostring(round(perc, 2), "commas") | eval time=strftime(time, "%x %X") | xyseries time reasonCode perc
0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...