Splunk Search

How can I search all the XML nested data?

Mtakahashi
Path Finder

Dear all,

I need to search all XML tagged data including nested data but I only get first data by a search command.
Please help me how can I search all the XML data? Splunk version is 6.5.3

Here are details
(1) source XML file
(2) props.conf
(3) transforms.conf
(4) Search SPL
(5) Search result (screen shot)

(1) source XML file (Samle_XML_for_answers.xml)

<?xml version="1.0" encoding="Shift-JIS"?><Root><DataArea><Businessname>12345</Businessname><Contractcd>ABC1234567890</Contractcd><ReceptNumber>C12345</ReceptNumber><ChargeList><Charge><Ratingname>Contract</Ratingname><Aprice>800</Aprice><Count></Count></Charge><Charge><Ratingname>Product</Ratingname><Aprice>0</Aprice><Count></Count></Charge><Charge><Ratingname>Base</Ratingname><Aprice>4500</Aprice><Count></Count></Charge><Charge><Ratingname>Exchange</Ratingname><Aprice>1000</Aprice><Count></Count></Charge><Charge><Ratingname>Cabling</Ratingname><Aprice>10400</Aprice><Count></Count></Charge></ChargeList><PreviousALService></PreviousALService></DataArea></Root>

(2) props.conf

[(mysourcetype)]
CHARSET = SJIS
NO_BINARY_CHECK =1
pulldown_type = 1
REPORT-xmlkv = xmlkv-sample

(3) transforms.conf

[xmlkv-sample]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2

(4) Search SPL

source="Samle_XML_for_answers.xml"
 | table Businessname, Contractcd, ReceptNumber, Ratingname, Aprice, Count, PreviousALService
 | transpose

(5) Search result (screen shot)

alt text

0 Karma
1 Solution

niketn
Legend

[Updated Answer]

Seems like there will not be any other way but to fill empty count XML data as 0 in the raw data itself : | eval xmlData=replace(xmlData, "<Count></Count>", "<Count>0</Count>")

   | makeresults
   | eval xmlData="
 <?xml version=\"1.0\" encoding=\"Shift-JIS\"?><Root><DataArea><Businessname>12347</Businessname><Contractcd>ABC1234567892</Contractcd><ReceptNumber>C12347</ReceptNumber><ChargeList><Charge><Ratingname>Contract</Ratingname><Aprice>800</Aprice><Count></Count></Charge><Charge><Ratingname>Product</Ratingname><Aprice>0</Aprice><Count>100</Count></Charge><Charge><Ratingname>Base</Ratingname><Aprice>4500</Aprice><Count></Count></Charge><Charge><Ratingname>Exchange</Ratingname><Aprice>1000</Aprice><Count>50</Count></Charge><Charge><Ratingname>Cabling</Ratingname><Aprice>10400</Aprice><Count></Count></Charge></ChargeList><PreviousALService></PreviousALService></DataArea></Root>
 "
   | eval xmlData=replace(xmlData, "<Count></Count>", "<Count>0</Count>")
   | spath input=xmlData output=Ratingname path=Root.DataArea.ChargeList.Charge.Ratingname
   | spath input=xmlData output=Aprice path=Root.DataArea.ChargeList.Charge.Aprice
   | spath input=xmlData output=Count path=Root.DataArea.ChargeList.Charge.Count
   | eval iteratorMV=mvrange(0,mvcount(Ratingname),1)
   | mvexpand iteratorMV  
   | eval Ratingname=mvindex(Ratingname,iteratorMV)
   | eval Aprice=mvindex(Aprice,iteratorMV)
   | eval Count=mvindex(Count,iteratorMV)
   | table Ratingname Aprice Count

Since your Count field is null you have less number of multi-valued fields in the Count field. Because of this the mvzip command is stitching only those multivalued fields corresponding to which there is Count data. You need a different approach here:

| makeresults
| eval xmlData="
<?xml version=\"1.0\" encoding=\"Shift-JIS\"?><Root><DataArea><Businessname>12347</Businessname><Contractcd>ABC1234567892</Contractcd><ReceptNumber>C12347</ReceptNumber><ChargeList><Charge><Ratingname>Contract</Ratingname><Aprice>800</Aprice><Count></Count></Charge><Charge><Ratingname>Product</Ratingname><Aprice>0</Aprice><Count>100</Count></Charge><Charge><Ratingname>Base</Ratingname><Aprice>4500</Aprice><Count></Count></Charge><Charge><Ratingname>Exchange</Ratingname><Aprice>1000</Aprice><Count>50</Count></Charge><Charge><Ratingname>Cabling</Ratingname><Aprice>10400</Aprice><Count></Count></Charge></ChargeList><PreviousALService></PreviousALService></DataArea></Root>
"
  | spath input=xmlData output=Ratingname path=Root.DataArea.ChargeList.Charge.Ratingname
  | spath input=xmlData output=Aprice path=Root.DataArea.ChargeList.Charge.Aprice
  | spath input=xmlData output=Count path=Root.DataArea.ChargeList.Charge.Count
  | eval iteratorMV=mvrange(0,mvcount(Ratingname),1)
  | mvexpand iteratorMV  
  | eval Ratingname=mvindex(Ratingname,iteratorMV)
  | eval Aprice=mvindex(Aprice,iteratorMV)
  | eval Count=if(isnull(mvindex(Count,iteratorMV)),0,mvindex(Count,iteratorMV))
  | table Ratingname Aprice Count 

PS: Using the mvrange() command I have created a dummy iterator assuming RatingName will always be present in XML Data. With mvindex() command, I am creating single value fields and for Count I have replaced with 0 when mvindex() returns null.

Read Documentation for details for working with Multi Valued fields: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/MultivalueEvalFunctions

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

View solution in original post

niketn
Legend

[Updated Answer]

Seems like there will not be any other way but to fill empty count XML data as 0 in the raw data itself : | eval xmlData=replace(xmlData, "<Count></Count>", "<Count>0</Count>")

   | makeresults
   | eval xmlData="
 <?xml version=\"1.0\" encoding=\"Shift-JIS\"?><Root><DataArea><Businessname>12347</Businessname><Contractcd>ABC1234567892</Contractcd><ReceptNumber>C12347</ReceptNumber><ChargeList><Charge><Ratingname>Contract</Ratingname><Aprice>800</Aprice><Count></Count></Charge><Charge><Ratingname>Product</Ratingname><Aprice>0</Aprice><Count>100</Count></Charge><Charge><Ratingname>Base</Ratingname><Aprice>4500</Aprice><Count></Count></Charge><Charge><Ratingname>Exchange</Ratingname><Aprice>1000</Aprice><Count>50</Count></Charge><Charge><Ratingname>Cabling</Ratingname><Aprice>10400</Aprice><Count></Count></Charge></ChargeList><PreviousALService></PreviousALService></DataArea></Root>
 "
   | eval xmlData=replace(xmlData, "<Count></Count>", "<Count>0</Count>")
   | spath input=xmlData output=Ratingname path=Root.DataArea.ChargeList.Charge.Ratingname
   | spath input=xmlData output=Aprice path=Root.DataArea.ChargeList.Charge.Aprice
   | spath input=xmlData output=Count path=Root.DataArea.ChargeList.Charge.Count
   | eval iteratorMV=mvrange(0,mvcount(Ratingname),1)
   | mvexpand iteratorMV  
   | eval Ratingname=mvindex(Ratingname,iteratorMV)
   | eval Aprice=mvindex(Aprice,iteratorMV)
   | eval Count=mvindex(Count,iteratorMV)
   | table Ratingname Aprice Count

Since your Count field is null you have less number of multi-valued fields in the Count field. Because of this the mvzip command is stitching only those multivalued fields corresponding to which there is Count data. You need a different approach here:

| makeresults
| eval xmlData="
<?xml version=\"1.0\" encoding=\"Shift-JIS\"?><Root><DataArea><Businessname>12347</Businessname><Contractcd>ABC1234567892</Contractcd><ReceptNumber>C12347</ReceptNumber><ChargeList><Charge><Ratingname>Contract</Ratingname><Aprice>800</Aprice><Count></Count></Charge><Charge><Ratingname>Product</Ratingname><Aprice>0</Aprice><Count>100</Count></Charge><Charge><Ratingname>Base</Ratingname><Aprice>4500</Aprice><Count></Count></Charge><Charge><Ratingname>Exchange</Ratingname><Aprice>1000</Aprice><Count>50</Count></Charge><Charge><Ratingname>Cabling</Ratingname><Aprice>10400</Aprice><Count></Count></Charge></ChargeList><PreviousALService></PreviousALService></DataArea></Root>
"
  | spath input=xmlData output=Ratingname path=Root.DataArea.ChargeList.Charge.Ratingname
  | spath input=xmlData output=Aprice path=Root.DataArea.ChargeList.Charge.Aprice
  | spath input=xmlData output=Count path=Root.DataArea.ChargeList.Charge.Count
  | eval iteratorMV=mvrange(0,mvcount(Ratingname),1)
  | mvexpand iteratorMV  
  | eval Ratingname=mvindex(Ratingname,iteratorMV)
  | eval Aprice=mvindex(Aprice,iteratorMV)
  | eval Count=if(isnull(mvindex(Count,iteratorMV)),0,mvindex(Count,iteratorMV))
  | table Ratingname Aprice Count 

PS: Using the mvrange() command I have created a dummy iterator assuming RatingName will always be present in XML Data. With mvindex() command, I am creating single value fields and for Count I have replaced with 0 when mvindex() returns null.

Read Documentation for details for working with Multi Valued fields: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/MultivalueEvalFunctions

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

Mtakahashi
Path Finder

Thank you very much for your update.

I executed the search and found that "Count" value is listed on the wrong record.
In the source file, Count=100 is related to "Product" and Count=50 is related to "Exchange"
but it looks Count values are moved to top.

I inserted | eval Count = if(isnull(Count), 0, Count) but it didn't work..

1   Contract    800 100
2   Product 0   50
3   Base    4500    0
4   Exchange    1000    0
5   Cabling 10400   0

Is there any way to properly correlate field values?
Here is the expected result.

    1   Contract    800 0
    2   Product 0   100
    3   Base    4500    0
    4   Exchange    1000    50
    5   Cabling 10400   0
0 Karma

Mtakahashi
Path Finder

Thanks for your confirmation. I really appreciate to your help!

0 Karma

niketn
Legend

@Mtakahashi, please accept and up vote the answer if it has helped.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

niketn
Legend

I have updated my answer. You will have to replace <Count></Count> with <Count>0</Count> in your raw data.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

niketn
Legend

@Mtakahashi, I uploaded your sample data with KV_MODE=xml defined in the sourcetype.
I was able to get Ratingname and Aprice automatically extracted at search time as Multi Valued fields Root.DataArea.ChargeList.Charge.Ratingname and Root.DataArea.ChargeList.Charge.Aprice respectively:

alt text

Refer to documentation: https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Automatickey-valuefieldextractionsatse...

Alternatively you can also try using spath at search time to parse XML Data. Following is a run anywhere search based on your data:

| makeresults 
| eval xmlData="<?xml version=\"1.0\" encoding=\"Shift-JIS\"?><Root><DataArea><Businessname>12345</Businessname><Contractcd>ABC1234567890</Contractcd><ReceptNumber>C12345</ReceptNumber><ChargeList><Charge><Ratingname>Contract</Ratingname><Aprice>800</Aprice><Count></Count></Charge><Charge><Ratingname>Product</Ratingname><Aprice>0</Aprice><Count></Count></Charge><Charge><Ratingname>Base</Ratingname><Aprice>4500</Aprice><Count></Count></Charge><Charge><Ratingname>Exchange</Ratingname><Aprice>1000</Aprice><Count></Count></Charge><Charge><Ratingname>Cabling</Ratingname><Aprice>10400</Aprice><Count></Count></Charge></ChargeList><PreviousALService></PreviousALService></DataArea></Root>"
| spath input=xmlData path=Root.DataArea.ChargeList.Charge.Ratingname output=Ratingname
| spath input=xmlData path=Root.DataArea.ChargeList.Charge.Aprice output=Aprice
| eval Charge=mvzip(Aprice,Ratingname)
| table Charge
| mvexpand Charge
| eval Charge=split(Charge,",")
| eval Aprice=mvindex(Charge,0)
| eval Ratingname=mvindex(Charge,1)
| table Aprice Ratingname
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

Mtakahashi
Path Finder

Hi, niketnilay
Many thanks for your reply. I could get XML data with using spath.
However, it didnt work for another sample XML data it has some "Count" value.

Here is another XML sample.

<?xml version="1.0" encoding="Shift-JIS"?><Root><DataArea><Businessname>12347</Businessname><Contractcd>ABC1234567892</Contractcd><ReceptNumber>C12347</ReceptNumber><ChargeList><Charge><Ratingname>Contract</Ratingname><Aprice>800</Aprice><Count></Count></Charge><Charge><Ratingname>Product</Ratingname><Aprice>0</Aprice><Count>100</Count></Charge><Charge><Ratingname>Base</Ratingname><Aprice>4500</Aprice><Count></Count></Charge><Charge><Ratingname>Exchange</Ratingname><Aprice>1000</Aprice><Count>50</Count></Charge><Charge><Ratingname>Cabling</Ratingname><Aprice>10400</Aprice><Count></Count></Charge></ChargeList><PreviousALService></PreviousALService></DataArea></Root>

and this is a search I executed

source="Samle_XML_for_answers3.xml"
 | spath output=Ratingname path=Root.DataArea.ChargeList.Charge.Ratingname
 | spath output=Aprice path=Root.DataArea.ChargeList.Charge.Aprice
 | spath output=Count path=Root.DataArea.ChargeList.Charge.Count
 | eval Charge=mvzip(mvzip(Ratingname, Aprice, ","), Count, ",")
 | mvexpand Charge
 | eval Charge=split(Charge,",")
 | eval Ratingname=mvindex(Charge,0) 
 | eval Aprice=mvindex(Charge,1) 
 | eval Count=mvindex(Charge,2)
 | table Ratingname, Aprice, Count

Then, I only get 2 records with inproper "Count" values

        Ratingname  Aprice  Count
    1   Contract    800 100
    2   Product 0   50

I expected to get 5 records

         Ratingname    Aprice    Count
     1    Contract    800
     2    Product    0    100
     3    Base    4500
     4    Exchange    1000    50
     5    Cabling    10400

Can you please let me know how can I get all the 5 record values..

Thanks

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...