Splunk Search

rex over multiple lines

tbasima1
Explorer

Dear all,

I hope to find some help here.
I've tried several things, including searching the answers here, but haven't found a solution.

For example, I have a log file that is structured like this:

<?xml version="1.0" encoding="UTF-8"?>
<PublishLog Version="5.0">
<RequestReport>
<Request>
<Id>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</Id>
<StartTime>1407386816620</StartTime>
<User>cli@ss001500.tauri.ch</User>
<Type>ps_publish</Type>
<RequestTime>1407386816587</RequestTime>
<RequestMsg/>
<Description>cli for user cli@ss001500.tauri.ch</Description>
<ClientData/>
<Result>FLR</Result>
<EndTime>1407387275454</EndTime>
</Request>
<Replies>
<ReplyFirst>
<Time>1407386816719</Time>
<Result>ACK</Result>
<RequestId>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</RequestId>
</ReplyFirst>
<ReplyLast>
<Time>1407387275454</Time>
<Result>FLR</Result>
<ResultNlMsg>
<NlMsgId>BMC-IPS000206I</NlMsgId>
</ResultNlMsg>
</ReplyLast>
</Replies>
</RequestReport>
....

With the rex expression

`rex field=_raw ".*FLR</Result>\s+<EndTime>(?<EndTime>.*?)</EndTime>"`

I get the EndTime value. No problem.

But now I want to match the first FLR and then skip ahead to the <NlMsgId>:

`rex field=_raw ".*FLR</Result>[WHATISMISSINGHERE??]<NlMsgId>(?<BMCI>.*?)</NlMsgId>"`

What do I have to put into the regular expression so that it skips the text between

`FLR</Result>`

and

`<NlMsgId>`

?

I can't search directly for the <NlMsgId> because there are other <NlMsgId> elements earlier in the event, in the text not listed here.
I've tried the star and a lot of other things with no success 😞

Also, does anyone have hints on where best to start to get more familiar with regular expressions?
Thanks a lot and cheers

Markus


to4kawa
Ultra Champion
| makeresults 
| eval _raw="<?xml version=\"1.0\" encoding=\"UTF-8\"?> 
<PublishLog Version=\"5.0\"> <RequestReport> <Request>
<Id>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</Id> <StartTime>1407386816620</StartTime>
<User>cli@ss001500.tauri.ch</User> <Type>ps_publish</Type>
<RequestTime>1407386816587</RequestTime> <RequestMsg/> <Description>cli for user cli@ss001500.tauri.ch</Description>
<ClientData/> <Result>FLR</Result> <EndTime>1407387275454</EndTime> </Request>
<Replies> <ReplyFirst> <Time>1407386816719</Time> <Result>ACK</Result>
<RequestId>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</RequestId> </ReplyFirst>
<ReplyLast> <Time>1407387275454</Time> <Result>FLR</Result>
<ResultNlMsg> <NlMsgId>BMC-IPS000206I</NlMsgId> </ResultNlMsg>
</ReplyLast> </Replies> </RequestReport>"
| rex field=_raw "(?s).*FLR.*\<NlMsgId\>(?<BMCI>.*?)\<\/NlMsgId\>"

Try the (?s) option (PCRE_DOTALL):

If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.

cf PCRE Pattern Modifiers
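
To see the effect of the modifier outside Splunk, here is a minimal Python sketch. It is only an illustration: the event is a shortened stand-in for the log in the question, and Python's `re` module spells named groups `(?P<name>...)`, whereas the `rex` patterns in this thread use `(?<name>...)`.

```python
import re

# Shortened stand-in for the multi-line event from the question.
event = """<ReplyLast>
<Time>1407387275454</Time>
<Result>FLR</Result>
<ResultNlMsg>
<NlMsgId>BMC-IPS000206I</NlMsgId>
</ResultNlMsg>
</ReplyLast>"""

# Same idea as the rex pattern, written with Python's (?P<name>...) syntax.
pattern = r"FLR</Result>.*<NlMsgId>(?P<BMCI>.*?)</NlMsgId>"

# Without (?s) the dot cannot cross the newline after </Result>.
without_dotall = re.search(pattern, event)

# With (?s) the dot also matches newlines, so the pattern spans lines.
with_dotall = re.search(r"(?s)" + pattern, event)

print(without_dotall)                # None
print(with_dotall.group("BMCI"))     # BMC-IPS000206I
```

The only difference between the two searches is the leading `(?s)`, which is why the single-line attempts in the question could never reach the `<NlMsgId>` line.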

echalex
Builder

Perhaps using xpath would help you? Quick testing with the following command yields results for me:

your_search |xpath outfield=NlMsgId "*/Replies/ReplyLast[Result="FLR"]/ResultNlMsg/NlMsgId"

Note: according to the documentation for xpath, the quotes surrounding FLR should need escaping. However, escaping them does not work for me, whereas the search as shown does.

Mind you, if your data is complete and well formed, you might benefit from using the complete path, rather than a path with an asterisk, as I have done.
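
To check the path logic independently of Splunk, the same predicate can be tried with Python's standard `xml.etree.ElementTree`. This is a sketch under assumptions: the document below is a shortened copy of the event with the closing tags added so that it is well formed, and ElementTree supports only a subset of XPath, so it may not behave exactly like Splunk's xpath command.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the event; the closing
# </PublishLog> tag is assumed, since the question elides the rest.
doc = """<PublishLog Version="5.0">
<RequestReport>
<Replies>
<ReplyFirst>
<Time>1407386816719</Time>
<Result>ACK</Result>
</ReplyFirst>
<ReplyLast>
<Time>1407387275454</Time>
<Result>FLR</Result>
<ResultNlMsg>
<NlMsgId>BMC-IPS000206I</NlMsgId>
</ResultNlMsg>
</ReplyLast>
</Replies>
</RequestReport>
</PublishLog>"""

root = ET.fromstring(doc)

# Complete path from the root element, with a predicate that keeps only
# a ReplyLast whose Result child has the text FLR.
path = "./RequestReport/Replies/ReplyLast[Result='FLR']/ResultNlMsg/NlMsgId"
node = root.find(path)
nl_msg_id = node.text if node is not None else None

print(nl_msg_id)  # BMC-IPS000206I
```

Single quotes around FLR inside the double-quoted path avoid the nested-quote problem in Python; whether the same quoting works in the xpath command is something to test against your own data.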

HTH


tbasima1
Explorer

Thanks for the help, and sorry for being a pain 🙂
The xpath was not working, so I think there is a mistake somewhere.
I also found an article saying that the first line with the version

<?xml version="1.0" encoding="UTF-8"?>

could cause a problem. So I've added the first few lines of the search result at the top (I've edited the original question).

index=patrol sourcetype=pserverlog FLR CmdbId "" | xpath outfield=NlMsgId "*/Replies/ReplyLast[Result="FLR"]/ResultNlMsg/NlMsgId" | table _time NlMsgId

I got some results, but no NlMsgId column. Maybe there is a mistake?
Thanks
Markus


echalex
Builder

Hi tbasima1,

Firstly, you need an expression that matches any character, including a newline. By default the dot does not match a newline, so one way around that is an alternation such as (?:\n|.). Then, to skip everything up to the tag <NlMsgId>, you could use a zero-width look-ahead assertion, which checks the text following your expression without consuming it.

So try this:

 rex field=_raw ".*FLR</Result>(?:\n|.)*(?=<NlMsgId>)<NlMsgId>(?<BMCI>.*?)</NlMsgId>"
  • The expression (?:\n|.)* matches any sequence of characters, including a newline
  • The expression (?=<NlMsgId>) checks that the previous expression is followed by <NlMsgId>, without "eating up" the match, so it is left for the next expression to pick up
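
The two building blocks above can be tried out in Python as a rough check. The event and its IDs are hypothetical, invented to have two <NlMsgId> elements after the first FLR, and Python's `(?P<name>...)` replaces the `(?<name>...)` used by rex. The sketch also compares a lazy `*?` quantifier, which is my own variation and not part of the expression above.

```python
import re

# Hypothetical event: two <NlMsgId> elements follow the first FLR.
event = """<Result>FLR</Result>
<EndTime>1407387275454</EndTime>
<NlMsgId>FIRST-ID</NlMsgId>
<Result>FLR</Result>
<NlMsgId>SECOND-ID</NlMsgId>"""

tail = r"<NlMsgId>(?P<BMCI>.*?)</NlMsgId>"

# Alternation plus look-ahead, as in the rex expression above.
with_lookahead = re.search(r"FLR</Result>(?:\n|.)*(?=<NlMsgId>)" + tail, event)

# Same pattern without the look-ahead: the literal <NlMsgId> that
# follows already forces the same position.
greedy = re.search(r"FLR</Result>(?:\n|.)*" + tail, event)

# Lazy variant: stop at the first <NlMsgId> after FLR instead of the last.
lazy = re.search(r"FLR</Result>(?:\n|.)*?" + tail, event)

print(with_lookahead.group("BMCI"))  # SECOND-ID
print(greedy.group("BMCI"))          # SECOND-ID
print(lazy.group("BMCI"))            # FIRST-ID
```

In this sketch the greedy form lands on the last <NlMsgId> in the event, so if the first one after FLR is wanted, the lazy form may be closer to the goal.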

HTH!

echalex
Builder

Ok, I see. Without seeing all your data, I can't see a reason why it works for me and not for you. Perhaps you could benefit from using xpath rather than rex?


tbasima1
Explorer

Sorry for the confusion. Yes, there is much more data in the log. That was the reason I placed the ..... there. The event that is found could have 244 lines, and the part that I've listed is included in it.


echalex
Builder

I mean that the data you've pasted into the question will not be matched by the search, as it does not contain <SmmPublishRollback>. Maybe you didn't paste all the data into the question?


tbasima1
Explorer

Hi,
No, I've only put the search part in front, placed your rex line after it, and appended the table formatting. What do you mean by "does not match the previous data"?


echalex
Builder

Hi,
Do you have a different set of data now? I see your base search does not match the previous data.


tbasima1
Explorer

Hi echalex,

thanks a lot for your support.

I've tried several things but unfortunately it did not work 😞
This is my command string

index=patrol sourcetype=pserverlog FLR CmdbId "<SmmPublishRollback>" | rex field=_raw ".*FLR</Result>(?:\n|.)*(?=<NlMsgId>)<NlMsgId>(?<BMCI>.*?)</NlMsgId>" | table _time BMCI

Only the time column is shown.

Any idea?
Thanks and cheers

Markus


echalex
Builder

It struck me that a look-behind assertion might also work, but it seems they have to be of fixed width. Not sure. In any case, it would make for a messier regex.
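
On the fixed-width point: the PCRE documentation states that each alternative of a look-behind must match strings of a fixed length. Python's `re` module enforces a similar restriction, so it can serve as a quick, approximate check. The helper name `compiles` is made up for this sketch.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed-width look-behind: the literal <NlMsgId> always has the same length.
fixed = compiles(r"(?<=<NlMsgId>)[^<]+")

# Variable-width look-behind: the .* inside makes the length unbounded.
variable = compiles(r"(?<=FLR</Result>.*<NlMsgId>)[^<]+")

print(fixed, variable)

# The fixed-width form can still pull out the value right after the tag.
match = re.search(r"(?<=<NlMsgId>)[^<]+", "<NlMsgId>BMC-IPS000206I</NlMsgId>")
print(match.group(0))
```

So a look-behind can anchor on the tag itself, but it cannot express "somewhere after FLR", which is the part the question needs.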
