Splunk Search

regex field extraction

ialahdal
Path Finder

I have an event that is in an HTML tag format, I'd like to extract data within it in a specific manner, as follows:
<TAG1>Splunking</TAG1>

I was trying to extract the data by matching group1 "TAG1" to group2 "/TAG1" and extracting what's in between into a filed named the same as group1, is this possible??

The best I was able to achieve was this <([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(.*?)<\/\1>
But that doesn't work in nested tags, I also don't know how to assign a filed to a group based on a previous one in splunk.

0 Karma
1 Solution

poete
Builder

Hello @ialahdal,

I think you should use spath in this case (https://docs.splunk.com/Documentation/SplunkCloud/8.0.0/SearchReference/Spath).

Please find below an example of use, with 2 levels of fields in the xml.

| makeresults 
| eval somefield="<level1><someFieldLevel1>someValueLevel1</someFieldLevel1><level2><someFieldLevel2>someValueLevel2</someFieldLevel2></level2></level1>"
| spath input=somefield

View solution in original post

poete
Builder

Hello @ialahdal,

I think you should use spath in this case (https://docs.splunk.com/Documentation/SplunkCloud/8.0.0/SearchReference/Spath).

Please find below an example of use, with 2 levels of fields in the xml.

| makeresults 
| eval somefield="<level1><someFieldLevel1>someValueLevel1</someFieldLevel1><level2><someFieldLevel2>someValueLevel2</someFieldLevel2></level2></level1>"
| spath input=somefield

ialahdal
Path Finder

This helped, thank you.

0 Karma
Get Updates on the Splunk Community!

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...