Splunk Search

How to modify my regular expression to extract strings between two pipes?

maximusdm
Communicator

Hello, I need to extract the string between the two separators in values like the samples below.
(Sometimes the separator is a pipe " | " and sometimes it is an uppercase letter " I ".)

ASDSAD ASDASD ASDAS | STRING001 | ASDA ASDASD ASDASDADADA
ASDSAD ASDASD ASDAS I STRING002 I ASDA ASDASD ASDASDADADA

My regular expressions work 90% of the time:

| rex field="Site Section" ".*\|\s*(?<SiteSection>.*)\s*\|"   
| rex field="Site Section" ".*\I\s*(?<SiteSection>.*)\s*\I"  
| rex field="Site Section" ".*\I\s*(?<SiteSection>.*)\s*\|" 
| rex field="Site Section" ".*\|\s*(?<SiteSection>.*)\s*\I" 

However, they do not work for the strings below:
ASDASD ASDASDASDA ADASDADAD I AMC I IFC <=== returns empty
(most likely because the "IFC" string contains an uppercase letter "I")

ASDASD ASDASDASDA ADASDADAD I DISCO I ADASDA <=== returns "ISCO"
(most likely because the "DISCO" string contains an uppercase letter "I")
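
The empty result for the first string can be reproduced outside Splunk. The following is a minimal Python sketch, not Splunk itself. It assumes Python's re module backtracks the same way as Splunk's regex engine for this pattern, it uses Python's (?P<name>...) named-group syntax, and it writes the separator as a plain I. The helper name extract_greedy is made up for illustration.

```python
import re

# Python analogue of the second rex from the question.
# Assumption: Python's `re` backtracks like Splunk's engine for this pattern.
GREEDY = re.compile(r".*I\s*(?P<SiteSection>.*)\s*I")

def extract_greedy(text):
    """Return the SiteSection capture, or None when nothing matches."""
    m = GREEDY.search(text)
    return m.group("SiteSection") if m else None

# The leading greedy .* runs to the end of the string and backtracks only as
# far as it must, so the first "I" is matched as late as possible. Here that
# is the second separator, and the closing "I" ends up being the first letter
# of "IFC", which leaves nothing in between.
print(repr(extract_greedy("ASDASD ASDASDASDA ADASDADAD I AMC I IFC")))  # ''
```

Requiring whitespace around the separator, so that an "I" inside a word cannot act as one, avoids this.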

Any ideas how to modify my regular expression?
Thanks

1 Solution

somesoni2
Revered Legend

Give this a try

Updated

your base search | rex field="Site Section" "\s(\||I)\s+(?<SiteSection>.+)\s+(\||I)\s" 
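
To see what this pattern does with the samples from the thread, here is a minimal Python sketch. It approximates Splunk rather than running it: Python's re needs (?P<name>...) for named groups, and the helper name site_section is hypothetical.

```python
import re

# Python port of the rex pattern above; only the named-group syntax differs.
PATTERN = re.compile(r"\s(\||I)\s+(?P<SiteSection>.+)\s+(\||I)\s")

def site_section(text):
    """Return the text between the two separators, or None if no match."""
    m = PATTERN.search(text)
    return m.group("SiteSection") if m else None

samples = [
    "ASDSAD ASDASD ASDAS | STRING001 | ASDA ASDASD ASDASDADADA",
    "ASDSAD ASDASD ASDAS I STRING002 I ASDA ASDASD ASDASDADADA",
    "ASDASD ASDASDASDA ADASDADAD I AMC I IFC",
    "ASDASD ASDASDASDA ADASDADAD I DISCO I ADASDA",
    "AASSDDF DFGJKJ | A&E |FYI",  # no whitespace after the second pipe
]
for s in samples:
    print(repr(site_section(s)), "<-", s)
```

Because each separator must have whitespace on both sides, an "I" inside a word such as "IFC" or "DISCO" is no longer treated as a separator. The same requirement is why the last sample, with no space after the second pipe, does not match at all.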

gokadroid
Motivator

If still required, can you check this one, which should work in most cases:

your query to return events
| rex field=_raw "\s*(\s*\|\s*(?<captureMe>[^\|]+)\|\s*)"
| table captureMe

maximusdm
Communicator

It is a lot better, but if there is an uppercase letter "I" after the second pipe " | ", it still doesn't work properly. Thanks

somesoni2
Revered Legend

A sample log where it's failing?

maximusdm
Communicator

If you have a string such as ABCDE I AAA I IFC, the result will be "AAA I" and not "AAA" as it should be.

somesoni2
Revered Legend

Will the value you want to capture always be a single word, or can it be multiple words?
Try the updated answer as well.

maximusdm
Communicator

With your update, only one string failed, and that is because there is no space between the second pipe "|" and the text that follows it, for instance:
AASSDDF DFGJKJ | A&E |FYI will return nothing.

PS: strings with 2 words between the pipes work just fine!

somesoni2
Revered Legend

How about this?

your base search | rex field="Site Section" "\s(\||I)\s+(?<SiteSection>.+)\s+(\||I\s)" 

maximusdm
Communicator

Now it fails when there is no space before or after the first pipe LOL
for instance:
ASDF ASDF| A&E |FYI or
ASDF ASDF |A&E |FYI

maximusdm
Communicator

I resolved my problem by replacing each standalone " I " with a pipe before running the next regular expression.

| rex field="Site Section" mode=sed "s,\sI\s, | ,g"
| rex field="Site Section" ".*\|\s*(?<SiteSection>.*)\s*\|"
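
The two-step idea (normalize the separators first, then extract) can be sketched in Python to check it against the strings that failed earlier in the thread. This is an illustration under assumptions, not the poster's exact search: the second pattern is assumed to be the pipe-only rex from the original question, Python's (?P<name>...) syntax replaces (?<name>...), and the helper name site_section_two_step is hypothetical.

```python
import re

def site_section_two_step(text):
    """Normalize ' I ' separators to pipes, then capture between the pipes."""
    # Step 1: counterpart of the sed-mode rex  s,\sI\s, | ,g
    normalized = re.sub(r"\sI\s", " | ", text)
    # Step 2: ASSUMPTION - the pipe-only pattern from the original question.
    m = re.search(r".*\|\s*(?P<SiteSection>.*)\s*\|", normalized)
    # The greedy group keeps any space before the closing pipe, so this
    # sketch strips it; the raw capture would be e.g. "AMC ".
    return m.group("SiteSection").strip() if m else None

samples = [
    "ASDASD ASDASDASDA ADASDADAD I AMC I IFC",
    "ASDASD ASDASDASDA ADASDADAD I DISCO I ADASDA",
    "AASSDDF DFGJKJ | A&E |FYI",
    "ASDF ASDF| A&E |FYI",
    "ASDF ASDF |A&E |FYI",
]
for s in samples:
    print(repr(site_section_two_step(s)), "<-", s)
```

Once only pipes remain as separators, the second pattern no longer has to tell a separator "I" from a letter inside a word, and its \s* allows the pipes to sit directly against the neighboring text.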

I want to thank you for pointing me in the right direction.
