Splunk Search

How do you extract an XML tag from _raw field?

mukesh2019
Explorer

Hi all,

I'm new to Splunk and don't know much about regex.

I'm trying to extract the content of the "faultstring" tag, but only when Detail="RetreiveClaims Service Response payload without Invalid Characters", from the output below.

Sample input:

2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message="com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure"|Detail="RetreiveClaims Service Response payload without Invalid Characters"|DetailXML="<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<env:Body>
<env:Fault>
<faultcode>env:Client</faultcode>
<faultstring>Internal Error</faultstring>
<detail>
<device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
<domain>ABCDEF-11500_QA_v000</domain>
<object>ABCDEFI_QA</object>
<date>2018-12-23-04:42:47 EST</date>
<tid>87917743</tid>
<server>abcdef.domain.com</server>
<port>12345</port>
<errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>"  

I'm looking for something like:

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 | regex

Expected Output

Internal Error


kamlesh_vaghela
SplunkTrust

@mukesh2019

Can you please try this?

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

My Sample Search:

| makeresults 
| eval _raw="2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message=\"com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure\"|Detail=\"RetreiveClaims Service Response payload without Invalid Characters\"|DetailXML=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
 <env:Envelope xmlns:env=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:fn=\"http://www.w3.org/2005/xpath-functions\">
 <env:Body>
 <env:Fault>
 <faultcode>env:Client</faultcode>
 <faultstring>Internal Error</faultstring>
 <detail>
 <device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
 <domain>ABCDEF-11500_QA_v000</domain>
 <object>ABCDEFI_QA</object>
 <date>2018-12-23-04:42:47 EST</date>
 <tid>87917743</tid>
 <server>abcdef.domain.com</server>
 <port>12345</port>
 <errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>\"  
" 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

Thanks
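The search above extracts faultstring from any matching event, while the question asks for it only when Detail has one specific value. A minimal sketch of one way to add that condition follows. It assumes Detail always appears in _raw as Detail="..." with no embedded double quotes; the field name detail_text is hypothetical and chosen only to avoid clashing with any automatically extracted Detail field. The [^<]+ class is a substitution for .+ so the capture stops at the first closing tag.

```spl
index=abcd_nontricare source="*.log" "abc" earliest=-1h
| rex field=_raw "Detail=\"(?<detail_text>[^\"]*)\""
| where detail_text="RetreiveClaims Service Response payload without Invalid Characters"
| head 1
| rex field=_raw "<faultstring>(?<faultstring>[^<]+)</faultstring>"
| table faultstring
```

Here head 1 is placed after the where clause so that the first event kept is one that actually satisfies the Detail condition; with head 1 before the filter, the single event retained might be discarded by it.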


a_m_s
Explorer

If you are looking only for the faultstring tag, the following regex should work.

<faultstring>(?<faultstring>.*)<\/faultstring>
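The patterns in this thread use a greedy quantifier (.+ or .*), which is fine for the sample event because the tag sits alone on its line. As an illustration of what could happen if two faultstring elements ever shared a line, here is a small sketch using synthetic data; the field names greedy and bounded and the test string are made up for the comparison.

```spl
| makeresults
| eval _raw="<faultstring>First</faultstring><faultstring>Second</faultstring>"
| rex field=_raw "<faultstring>(?<greedy>.*)</faultstring>"
| rex field=_raw "<faultstring>(?<bounded>[^<]*)</faultstring>"
| table greedy bounded
```

The greedy capture runs to the last closing tag and returns First</faultstring><faultstring>Second, while the bounded capture stops at the first "<" and returns First.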

mukesh2019
Explorer

Thanks a lot 🙂
