Splunk Search

How to get the required text from rex?

saitejagayala
New Member

Hello,
I want to extract only a specific piece of text from my logs using rex.

For instance, suppose the logs contain data wrapped in tags, e.g.:

<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

The search I am using is:

  index = * | rex field=Msg "<data>(?<error>.*)" | table error

The output I am getting is:

error
This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

What I need is only the data inside the <data> tag, i.e.:

REQUIRED OUTPUT

 error
This consists of DB data

However, everything that follows the tag's content is also displayed, which I don't need.

Can anyone help me out with this?


harsmarvania57
Ultra Champion

Hi,

Please try the regex below; it extracts the value into a new field called ext_data.

<yourBaseSearch>
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"

EDIT: Updated the regex because I found a space after <data>.


FrankVl
Ultra Champion

This should probably work:

| rex field=Msg "\<data\>(?<error>[^<]+)"

https://regex101.com/r/tpYcTu/1
If your data indeed contains whitespace around the tags, you can strip it off using | eval error=trim(error) after the rex command (this can also be done with a more complex regex).
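
For anyone who wants to try this without real data, here is a minimal run-anywhere sketch. It assumes the sample line from the question sits in a field called Msg (created here with makeresults and eval purely for illustration) and that the extracted field is named error, as in the rex above. The second rex is only one possible form of a "more complex regex": a lazy capture followed by \s*</data>, so the surrounding whitespace never ends up in the field. The field name error_trimmed is hypothetical.

```spl
| makeresults
| eval Msg="<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>"
| rex field=Msg "<data>(?<error>[^<]+)"
| eval error=trim(error)
| rex field=Msg "<data>\s*(?<error_trimmed>[^<]*?)\s*</data>"
| table error error_trimmed
```

Both columns should show the same value without leading or trailing spaces. The trim() route is easier to read; the regex-only route saves a pipeline step but only matches when the closing </data> tag is present.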


saitejagayala
New Member

Hi @harsmarvania57,
Can you explain the rex you wrote in more detail?


harsmarvania57
Ultra Champion

Yes, I'll try my best to explain the regex:

  1. \<data\> literally matches <data> in your raw data.
  2. \s? matches zero or one whitespace character after <data>.
  3. (?<ext_data>[^\<]*) captures every character up to (but not including) the next < and stores it in a new field called ext_data.
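
To see what each part contributes, the following sketch runs the pattern with and without \s? against the sample line from the question. The makeresults and eval lines only fabricate a single test event, and no_ws_skip, len_no_ws_skip, and len_ext_data are hypothetical field names used for the comparison.

```spl
| makeresults
| eval _raw="<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>"
| rex field=_raw "\<data\>(?<no_ws_skip>[^\<]*)"
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"
| eval len_no_ws_skip=len(no_ws_skip), len_ext_data=len(ext_data)
| table no_ws_skip ext_data len_no_ws_skip len_ext_data
```

ext_data should come out one character shorter than no_ws_skip, because \s? consumes the single space after <data>. The space before </data> is still captured in both cases, since [^\<]* only stops at the < character.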

FrankVl
Ultra Champion

Please post your current regex also as code (like you did with the sample data). Otherwise some special characters disappear.


FrankVl
Ultra Champion

Thanks for editing your question. The reason you're getting everything after the data tag is that you use .*, which is greedy and matches as many characters as it can. Have a look at the other answers in this thread for stricter regular expressions that stop at the < character.
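
The difference can be checked side by side with a small sketch. Assumptions: the sample line is placed in Msg via makeresults, and greedy, lazy, and negated are hypothetical field names. The lazy variant (.*? followed by the closing tag) does not come from the answers in this thread; it is shown only for comparison.

```spl
| makeresults
| eval Msg="<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>"
| rex field=Msg "<data>(?<greedy>.*)"
| rex field=Msg "<data>(?<lazy>.*?)</data>"
| rex field=Msg "<data>(?<negated>[^<]*)"
| table greedy lazy negated
```

greedy should run on to the end of the line, while lazy and negated should both stop before </data>. The negated character class is usually the simpler choice here because it does not depend on the closing tag being present.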
