Dear all,
I hope to find some help here.
I've tried several things, including searching the answers here, but I can't find the solution.
For example, I have a log file that is structured like this:
<?xml version="1.0" encoding="UTF-8"?>
....
<PublishLog Version="5.0">
<RequestReport>
<Request>
<Id>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</Id>
<StartTime>1407386816620</StartTime>
<User>cli@ss001500.tauri.ch</User>
<Type>ps_publish</Type>
<RequestTime>1407386816587</RequestTime>
<RequestMsg/>
<Description>cli for user cli@ss001500.tauri.ch</Description>
<ClientData/>
<Result>FLR</Result>
<EndTime>1407387275454</EndTime>
</Request>
<Replies>
<ReplyFirst>
<Time>1407386816719</Time>
<Result>ACK</Result>
<RequestId>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</RequestId>
</ReplyFirst>
<ReplyLast>
<Time>1407387275454</Time>
<Result>FLR</Result>
<ResultNlMsg>
<NlMsgId>BMC-IPS000206I</NlMsgId>
</ResultNlMsg>
</ReplyLast>
</Replies>
</RequestReport>
With the rex expression
`rex field=_raw ".*FLR</Result>\s+<EndTime>(?<EndTime>.*?)</EndTime>"`
I get the EndTime value. No problem.
But now I want to match up to the first FLR and then on to the <NlMsgId>:
`rex field=_raw ".*FLR</Result>[WHATISMISSINGHERE??]<NlMsgId>(?<BMCI>.*?)</NlMsgId>"`
What do I have to put in the regular expression so that it skips the text between `FLR</Result>` and `<NlMsgId>`?
I can't search directly for the <NlMsgId> because there are other ones earlier, in the part of the text not listed here.
I've tried star and a lot of other things with no success 😞
Also, does anyone have hints on where best to start so I can get more familiar with regular expressions?
Thanks a lot and cheers
Markus
| makeresults
| eval _raw="<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<PublishLog Version=\"5.0\"> <RequestReport> <Request>
<Id>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</Id> <StartTime>1407386816620</StartTime>
<User>cli@ss001500.tauri.ch</User> <Type>ps_publish</Type>
<RequestTime>1407386816587</RequestTime> <RequestMsg/> <Description>cli for user cli@ss001500.tauri.ch</Description>
<ClientData/> <Result>FLR</Result> <EndTime>1407387275454</EndTime> </Request>
<Replies> <ReplyFirst> <Time>1407386816719</Time> <Result>ACK</Result>
<RequestId>Z00000hyjlq1l4Xpa3Z53MZbem7cZ</RequestId> </ReplyFirst>
<ReplyLast> <Time>1407387275454</Time> <Result>FLR</Result>
<ResultNlMsg> <NlMsgId>BMC-IPS000206I</NlMsgId> </ResultNlMsg>
</ReplyLast> </Replies> </RequestReport>"
| rex field=_raw "(?s).*FLR.*\<NlMsgId\>(?<BMCI>.*?)\<\/NlMsgId\>"
Try the (?s) option (PCRE_DOTALL):
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
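To see the effect of (?s) outside Splunk, here is a minimal sketch in Python. It is only an analogy: Python's `re` module is not PCRE, and it spells named groups `(?P<name>...)` where rex accepts `(?<name>...)`. The `sample` string is a shortened stand-in for the log in the question.

```python
import re

# Shortened three-line stand-in for the log excerpt in the question.
sample = "<Result>FLR</Result>\n<ResultNlMsg>\n<NlMsgId>BMC-IPS000206I</NlMsgId>"

# Python needs (?P<name>...) for named groups; Splunk's rex accepts (?<name>...).
pattern = r"FLR.*<NlMsgId>(?P<BMCI>.*?)</NlMsgId>"

# Without (?s) the dot stops at each newline, so the match cannot cross lines.
without_dotall = re.search(pattern, sample)

# With (?s) the dot also matches "\n", so the match can span the three lines.
with_dotall = re.search("(?s)" + pattern, sample)

print(without_dotall)
print(with_dotall.group("BMCI"))
```

The first search finds nothing because `FLR` and `<NlMsgId>` sit on different lines; the second one captures the message ID.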
Perhaps using xpath would help you? Quick testing with the following command yields results for me:
your_search |xpath outfield=NlMsgId "*/Replies/ReplyLast[Result="FLR"]/ResultNlMsg/NlMsgId"
Note that, according to the documentation for xpath, you should need to escape the quotes surrounding FLR. However, escaping the quotes does not work for me, but the search as shown above does.
Mind you, if your data is complete and well formed, you might benefit from using the complete path, rather than a path with an asterisk, as I have done.
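The same path idea can be tried outside Splunk. The sketch below uses Python's `xml.etree.ElementTree`, which supports only a small XPath subset and is not Splunk's xpath command, so treat it as an illustration of the predicate `ReplyLast[Result='FLR']` rather than of Splunk behaviour. The XML is a trimmed copy of the log from the question; the closing `</PublishLog>` tag is an assumption, since the excerpt there is cut off.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the log from the question.
# The closing </PublishLog> is assumed; the original excerpt does not show it.
xml_text = """<PublishLog Version="5.0">
  <RequestReport>
    <Replies>
      <ReplyFirst><Result>ACK</Result></ReplyFirst>
      <ReplyLast>
        <Result>FLR</Result>
        <ResultNlMsg><NlMsgId>BMC-IPS000206I</NlMsgId></ResultNlMsg>
      </ReplyLast>
    </Replies>
  </RequestReport>
</PublishLog>"""

root = ET.fromstring(xml_text)  # root element is <PublishLog>

# Full path from the root, keeping only a <ReplyLast> whose <Result> child is FLR.
path = "./RequestReport/Replies/ReplyLast[Result='FLR']/ResultNlMsg/NlMsgId"
ids = [element.text for element in root.findall(path)]

print(ids)
```

Because the predicate tests the `Result` child, a `ReplyLast` with any other result would simply yield an empty list.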
HTH
Thanks for the help, and sorry for being a pain 🙂
The xpath was not working, so I think there is a mistake somewhere.
I also found an article saying that the first line with the version
<?xml version="1.0" encoding="UTF-8"?>
could cause a problem. So I've added the first few lines of the search result at the top (I've edited the original question).
index=patrol sourcetype=pserverlog FLR CmdbId "
I got some results, but not the NlMsgId column. Maybe this is a mistake?
Thanks
Markus
Hi tbasima1,
Firstly, you need an expression to match any character, including a newline. The dot does not match a newline by default, so you need alternation. Then, to skip everything up to the tag <NlMsgId>, you could use a zero-width look-ahead assertion, which checks for the text following your expression.
So try this:
rex field=_raw ".*FLR</Result>(?:\n|.)*(?=<NlMsgId>)<NlMsgId>(?<BMCI>.*?)</NlMsgId>"
`(?:\n|.)*` matches any sequence of characters, including newlines.
`(?=<NlMsgId>)` checks that the previous expression is followed by <NlMsgId>, without "eating up" the match, so it is left for the next expression to pick up.
HTH!
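For anyone who wants to experiment with these pieces, here is a small Python sketch. It is an approximation only: Python's `re` is not the PCRE engine behind rex, named groups are written `(?P<name>...)`, and the `event` text with its extra `EARLIER` and `LATER` IDs is made up for illustration. It compares the expression above with a variant that drops the look-ahead and with a lazy variant `(?:\n|.)*?`, which is my own addition and not part of the answer.

```python
import re

# Hypothetical event: one unrelated <NlMsgId> before the FLR result, two after it.
event = (
    "<NlMsgId>EARLIER</NlMsgId>\n"
    "<Result>FLR</Result>\n"
    "<ResultNlMsg>\n<NlMsgId>BMC-IPS000206I</NlMsgId>\n</ResultNlMsg>\n"
    "<NlMsgId>LATER</NlMsgId>\n"
)

tail = r"<NlMsgId>(?P<BMCI>.*?)</NlMsgId>"

# The expression from the answer (named group rewritten for Python).
greedy = r".*FLR</Result>(?:\n|.)*(?=<NlMsgId>)" + tail
# Same expression without the look-ahead.
no_lookahead = r".*FLR</Result>(?:\n|.)*" + tail
# Lazy quantifier: stop at the first <NlMsgId> after FLR</Result>.
lazy = r".*FLR</Result>(?:\n|.)*?" + tail

results = {
    name: re.search(pattern, event).group("BMCI")
    for name, pattern in [("greedy", greedy), ("no_lookahead", no_lookahead), ("lazy", lazy)]
}
print(results)
```

In this sketch the greedy forms run on to the last `<NlMsgId>` in the event, with or without the look-ahead, while the lazy form stops at the first one after `FLR</Result>`. Whether that matters depends on how many `<NlMsgId>` tags follow the FLR result in the real events.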
Ok, I see. Without seeing all your data, I can't see a reason why it works for me and not for you. Perhaps you could benefit from using xpath rather than rex?
Sorry for the confusion. Yes, there is much more data in the log. That was the reason I placed some ..... there. The event that will be found could have 244 lines, and the part that I've listed is included in it.
I mean that the data you've pasted into the question will not be matched by the search, as it does not contain <SmmPublishRollback>. Maybe you didn't paste all the data into the question?
Hi,
No, I've only put the search part in front, placed your rex line after it, and appended the table formatting. What do you mean by "does not match the previous data"?
Hi,
Do you have a different set of data now? I see your base search does not match the previous data.
Hi echalex,
thanks a lot for your support.
I've tried several things but unfortunately it did not work 😞
This is my command string
index=patrol sourcetype=pserverlog FLR CmdbId "<SmmPublishRollback>" | rex field=_raw ".*FLR</Result>(?:\n|.)*(?=<NlMsgId>)<NlMsgId>(?<BMCI>.*?)</NlMsgId>" | table _time BMCI
Only the time column is shown; the BMCI column stays empty.
Any idea?
Thanks and cheers
Markus
It struck me that a look-behind assertion might also work, but it seems they have to be of fixed width. Not sure. In any case, it would make for a messier regex.
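On the fixed-width point, a quick check in Python (again only an analogy, since Python's `re` is not the PCRE engine used by rex, although it has a similar restriction) shows a variable-width look-behind being rejected while a fixed-width one works:

```python
import re

# A look-behind containing .* has no fixed width, so compilation fails.
try:
    re.compile(r"(?<=FLR</Result>.*)<NlMsgId>")
    lookbehind_error = None
except re.error as exc:
    lookbehind_error = str(exc)

# A fixed-width look-behind is accepted: match the text right after <NlMsgId>.
fixed = re.search(r"(?<=<NlMsgId>)[^<]+", "<NlMsgId>BMC-IPS000206I</NlMsgId>")

print(lookbehind_error)
print(fixed.group(0))
```

So a look-behind could pick out the ID after a literal `<NlMsgId>`, but it cannot express "somewhere after FLR</Result>" on its own.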