Solved: Re: Split a String field to new rows and extract f...

kiru2992 · ‎09-22-2020

Hello Everyone!

Currently the result of my query is below:

Input:

id URL

101 https://......-28.../../..../..../..../12304

102 https://......-28.../../..../..../..../34569

https://......-02.../../..../..../..../8976

https://......-28.../..../..../741256

103 https://......-06.../..../..../..../5678

https://......-04.../../..../..../..../158930

I would like to have the output as below:

Output:

id URL fieldA fieldB

101 https://......-28.../../..../..../..../12304 28 12304

102 https://......-28.../../..../..../..../34569 28 34569

102 https://......-02.../../..../..../..../8976 02 8976

102 https://......-28.../..../..../741256 28 741256

103 https://......-06.../..../..../..../5678 06 5678

103 https://......-04.../../..../..../..../158930 04 158930

I have tried with rex and mvcombine , makemv but not able to achieve the result. I am not sure whether I am using them correctly.

Can you please help me to get the output?

ITWhisperer · ‎09-28-2020

The problem here is that mvexpand doesn't work with empty fields so the trick is to add a random string to the field (mvappend, mvcombine), then remove it if it is not the only value in the field (mvfilter), then replace it with an empty string after the mvexpand

... your search
| eval temp="randomstringwithoutwhitespaces"
| eval URL=mvappend(temp,URL)
| mvcombine delim=" " URL
| makemv tokenizer="(\S+)" URL 
| eval URL=if(mvcount(URL)>1,mvfilter(URL!="randomstringwithoutwhitespaces"),URL)
| mvexpand URL
| eval URL=if(URL="randomstringwithoutwhitespaces","",URL)
| rex field=URL "\/[^\d]*(-|)(?<fieldA>\d{2}).*\/(.*csv$|(?<fieldB>\d+$))"

View solution in original post

ITWhisperer · ‎09-22-2020

... your search
| eval URL=split(URL,"\n")
| mvexpand URL
| rex field=URL "-(?<fieldA>\d{2}).*\/(?<fieldB>\d+$)"

kiru2992 · ‎09-22-2020

Hello @ITWhisperer ,

I am sorry, the URL field turns out to be a string field where we have multiple links separated by space. I tried the split command with mvexpand and it is not working .

The rex command worked perfectly! If we are able to split the URL to separate links in the same field then I think we can extract the fieldA and fieldB.

Can you please let me how to split the split the string to new rows?

kiru2992 · ‎09-22-2020

It is a "\n" between links in the string.

ITWhisperer · ‎09-22-2020

Try this to split the URL string into multiple events

... your search
| makemv tokenizer="(\S+)" URL
| mvexpand URL
| rex field=URL "-(?<fieldA>\d{2}).*\/(?<fieldB>\d+$)"

kiru2992 · ‎09-28-2020

Hello @ITWhisperer ,

The above code works perfectly:) But I have certain rows where URL is "" and with the above code all the rows with URL as "" is removed.

Can you please let me know how extract new fields retaining all the existing rows?

kiru2992 · ‎09-28-2020

Hello @ITWhisperer

Can you also please help me to extract 'fieldA' in case the URL field is available in the below format?

URL fieldA

/opt/splunk-monitor/.../..../28-.../......csv 28

/opt/splunk-monitor/.../..../08-.../......csv 08

/opt/splunk-monitor/.../..../....-02/......csv 02

/opt/splunk-monitor/.../..../....-06/......csv 06

ITWhisperer · ‎09-28-2020

| rex max_match=0 field=URL "\/[^\d]*(-|)(?<fieldA>\d{2}).*\/(.*csv$|(?<fieldB>\d+$))"

kiru2992 · ‎09-28-2020

Hello @ITWhisperer ,

Thanks a lot.The extraction works very well:) But as I mentioned above, I have certain rows where URL is "" and with the below code all the rows with URL as "" is removed.

Can you please let me know how extract new fields retaining all the existing rows?

Code:

... your search
| makemv tokenizer="(\S+)" URL
| mvexpand URL
| rex field=URL "-(?<fieldA>\d{2}).*\/(?<fieldB>\d+$)"

ITWhisperer · ‎09-28-2020

The problem here is that mvexpand doesn't work with empty fields so the trick is to add a random string to the field (mvappend, mvcombine), then remove it if it is not the only value in the field (mvfilter), then replace it with an empty string after the mvexpand

... your search
| eval temp="randomstringwithoutwhitespaces"
| eval URL=mvappend(temp,URL)
| mvcombine delim=" " URL
| makemv tokenizer="(\S+)" URL 
| eval URL=if(mvcount(URL)>1,mvfilter(URL!="randomstringwithoutwhitespaces"),URL)
| mvexpand URL
| eval URL=if(URL="randomstringwithoutwhitespaces","",URL)
| rex field=URL "\/[^\d]*(-|)(?<fieldA>\d{2}).*\/(.*csv$|(?<fieldB>\d+$))"

kiru2992 · ‎09-29-2020

Hello @ITWhisperer ,

Thanks a lot:) This worked like magic:)

richgalloway · ‎09-22-2020

Please share your existing query.

For field extractions using rex, we need more details about the data from which the data will be extracted. That is to say the "..." may be hiding important characters that may determine how the regex must be written.

---
If this reply helps you, Karma would be appreciated.

kiru2992 · ‎09-22-2020

Hello @richgalloway ,

I am using the below query for rex now:

| rex max_match=0 field=URL"\/(?<fieldB>[^\/]*)(https:|\n|$)"

I hope this helps in modifing it to get fieldA as well.

New rows for each of the Extracted values from a multi-valued field

field extraction

regex

rex

Enter the Splunk Community Dashboard Challenge for Your Chance to Win!

.conf24 | Session Scheduler is Live!!

Introducing the Splunk Community Dashboard Challenge!