
Error when extracting multivalue field from xml data via rex command

jurbain
New Member

Hi

I need to extract a multivalue field from an event structured as XML.

<job>

<nameJob>Job1</nameJob>

<executionJob>



2016-10-03
12:31:25
ServerX
JobUser




Step 1

Clean directories A
2016/10/03 12:31:25

0 file(s) AND 0 folder(s)


rc=0

2016/10/03 12:31:26



DirClean

Clean Directories B
2016/10/03 12:31:26


==========================================



10 file(s) AND 0 folder(s)


rc=0

2016/10/03 12:31:27



2016-10-03

12:31:27
grc=0


</executionJob>
</job>

I cannot use xpath or KV_MODE=xml because some events contain special characters that prevent parsing.

Instead, I am trying to use a regular expression with the rex command, for example to extract the steps data.

The regular expression "<steps>(?<abc>((?!<\/steps>)[\s\S])*)<\/steps>" works well in a PCRE regular expression tester, but when I try the same thing in Splunk with the following command:

"basic search | rex field=_raw "<steps>(?<abc>((?!</steps>)[\s\S])*)<\/steps>" max_match=999"

I get this error message:

"Streamed search execute failed because: Error in 'rex' command: Regex match error, please check log"

Do you have any idea what is going wrong?

Thanks.
J.


richgalloway
SplunkTrust

The regex string is missing an escape character. Try

\<steps>(?<abc>((?!<\/steps>)[\s\S])*)<\/steps>
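
If the error persists, one thing that may be worth trying is a lazy quantifier in place of the per-character negative lookahead. The sketch below uses Python's re module (not Splunk) only to show that the two patterns capture the same text on a shortened copy of the event. Note that Python spells named groups (?P<name>...), while rex uses (?<name>...). Whether the lazy form actually avoids the rex error is an assumption I have not verified in Splunk.

```python
import re

# Shortened sample event; tag names come from the XML posted in this thread.
event = """<job><nameJob>Job1</nameJob><executionJob>
<steps><nameStep>Step 1</nameStep><rcStep>rc=0</rcStep></steps>
<steps><nameStep>Step 2</nameStep><rcStep>rc=0</rcStep></steps>
</executionJob></job>"""

# Pattern from the question: a lookahead is evaluated before every character.
tempered = re.compile(r"<steps>(?P<abc>((?!</steps>)[\s\S])*)</steps>")

# Lazy alternative: a single non-greedy quantifier, no nested group.
lazy = re.compile(r"<steps>(?P<abc>[\s\S]*?)</steps>")

tempered_hits = [m.group("abc") for m in tempered.finditer(event)]
lazy_hits = [m.group("abc") for m in lazy.finditer(event)]

print(len(lazy_hits))              # number of <steps> blocks found
print(lazy_hits == tempered_hits)  # both patterns capture the same text
```

The rex equivalent would presumably be "<steps>(?<abc>[\s\S]*?)<\/steps>" with the same max_match setting, but treat that as untested.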

That said, unless you need the entire XML indexed, you should consider using a scripted input to parse the XML and extract only the needed fields for indexing. A Python parser will be much easier to write and will save license and storage costs by reducing the XML verbosity.
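
As a rough idea of what such a parser could look like, here is a minimal sketch. It uses regular expressions rather than an XML library, on the assumption that the special characters mentioned in the question would also break a strict XML parser. The helper names (tag_values, parse_job, to_event) and the key="value" output layout are made up for illustration and are not a Splunk API. The sample is an abridged copy of the event posted in this thread.

```python
import re

# Abridged copy of the event posted in this thread.
SAMPLE = """<job>
  <nameJob>Job1</nameJob>
  <executionJob>
    <steps>
      <nameStep>Step 1</nameStep>
      <descrStep>Clean directories A </descrStep>
      <beginTimeStep>2016/10/03 12:31:25</beginTimeStep>
      <comStep><comment>Starting</comment></comStep>
      <comStep><comment>0 file(s) AND 0 folder(s)</comment></comStep>
    </steps>
    <steps>
      <nameStep>Step 2</nameStep>
      <descrStep>Clean Directories B</descrStep>
      <beginTimeStep>2016/10/03 12:31:26</beginTimeStep>
      <comStep><comment>Starting</comment></comStep>
      <comStep><comment>10 file(s) AND 0 folder(s)</comment></comStep>
    </steps>
  </executionJob>
</job>"""


def tag_values(tag, text):
    """Return the stripped text of every <tag>...</tag> pair found in text."""
    pattern = r"<{0}>(.*?)</{0}>".format(re.escape(tag))
    return [v.strip() for v in re.findall(pattern, text, flags=re.DOTALL)]


def parse_job(xml_text):
    """Build one dict per <steps> block, keeping the comments as a list."""
    job_names = tag_values("nameJob", xml_text)
    job = job_names[0] if job_names else ""
    rows = []
    for step in tag_values("steps", xml_text):
        row = {"nameJob": job, "comment": tag_values("comment", step)}
        for field in ("nameStep", "descrStep", "beginTimeStep"):
            values = tag_values(field, step)
            row[field] = values[0] if values else ""
        rows.append(row)
    return rows


def to_event(row):
    """Render a row as one line of key="value" pairs (hypothetical layout)."""
    keys = ("nameJob", "nameStep", "descrStep", "beginTimeStep")
    parts = ['{}="{}"'.format(k, row[k]) for k in keys]
    parts += ['comment="{}"'.format(c) for c in row["comment"]]
    return " ".join(parts)


rows = parse_job(SAMPLE)
for row in rows:
    print(to_event(row))
```

Each step becomes one compact line, with the comment key repeated once per comment. How Splunk treats the repeated key depends on your field extraction settings, so check that before relying on it.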


jurbain
New Member

Hi

I still get the error from the rex command (it does not like [\s\S]).
I will probably follow your recommendation and implement a Python parser to extract the information.

Thanks


lukejadamec
Super Champion

Can you list the values you are trying to extract for 'steps' from the event you posted?


jurbain
New Member

Hi Luke

The XML was not rendered correctly in my question. The structure of each event is:

    <job>
        <nameJob>Job1</nameJob>
        <executionJob>
            <started>
                <dateS>2016-10-03</dateS>
                <timeS>12:31:25</timeS>
                <serverS>ServerX</serverS>
                <userS>JobUser</userS>
            </started>
            <steps>
                <nameStep>Step 1</nameStep>
                <descrStep>Clean directories A </descrStep>
                <beginTimeStep>2016/10/03 12:31:25</beginTimeStep>
                <comStep>
                    <comment>Starting</comment>
                </comStep>
                <comStep>
                    <comment>Execution....</comment>
                </comStep>
                <comStep>
                    <comment>0 file(s) AND 0 folder(s)</comment>
                </comStep>
                <rcStep>rc=0</rcStep>
                <endTimeStep>2016/10/03 12:31:26</endTimeStep>
            </steps>
            <steps>
                <nameStep>Step 2</nameStep>
                <descrStep>Clean Directories B</descrStep>
                <beginTimeStep>2016/10/03 12:31:26</beginTimeStep>
                <comStep>
                    <comment>Starting</comment>
                </comStep>
                <comStep>
                    <comment>Execution....</comment>
                </comStep>
                <comStep>
                    <comment>10 file(s) AND 0 folder(s)</comment>
                </comStep>
                <rcStep>rc=0</rcStep>
                <endTimeStep>2016/10/03 12:31:27</endTimeStep>
            </steps>
            <ended>
                <dateE>2016-10-03</dateE>
                <timeE>12:31:27</timeE>
                <globalRcE>grc=0</globalRcE>
            </ended>
        </executionJob>
    </job>

So for one job event, I have several steps, each with a set of properties. Some of these properties are themselves multivalued, such as comment, which holds the output of each step.

My final objective is to get something like this:

nameJob  nameStep  descrStep            beginTimeStep        comment
Job1     Step 1    Clean directories A  2016/10/03 12:31:25  Starting
                                                             Execution...
                                                             0 file(s) AND 0 folder(s)
Job1     Step 2    Clean Directories B  2016/10/03 12:31:26  Starting
                                                             Execution...
                                                             10 file(s) AND 0 folder(s)