Splunk Search

How to modify my regular expression to extract strings between two pipes?

maximusdm
Communicator

Hello, I need to extract the string between two delimiters " | | ". Here are a few sample strings:
(sometimes the delimiter is a pipe " | " and sometimes it is an uppercase letter " I ")

ASDSAD ASDASD ASDAS | STRING001 | ASDA ASDASD ASDASDADADA
ASDSAD ASDASD ASDAS I STRING002 I ASDA ASDASD ASDASDADADA

My regular expressions work 90% of the time:

| rex field="Site Section" ".*\|\s*(?<SiteSection>.*)\s*\|"   
| rex field="Site Section" ".*\I\s*(?<SiteSection>.*)\s*\I"  
| rex field="Site Section" ".*\I\s*(?<SiteSection>.*)\s*\|" 
| rex field="Site Section" ".*\|\s*(?<SiteSection>.*)\s*\I" 

However, they do not work for the strings below:
ASDASD ASDASDASDA ADASDADAD I AMC I IFC <=== returns empty
(most likely because the "IFC" string contains an uppercase letter "I")

ASDASD ASDASDASDA ADASDADAD I DISCO I ADASDA <=== returns "ISCO"
(most likely because the "DISCO" string contains an uppercase letter "I")

Any ideas how to modify my regular expression?
Thanks

0 Karma

gokadroid
Motivator

If still required, can you check this one, which should work in most cases:

your query to return events
| rex field=_raw "\s*(\s*\|\s*(?<captureMe>[^\|]+)\|\s*)"
| table captureMe
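
To see which of the sample strings a pipe-only pattern like this covers, it can be tried outside Splunk. The following is a minimal sketch in Python, not a Splunk search: the function name `capture_between_pipes` is hypothetical, and Python's `re` module needs `(?P<name>...)` where Splunk's PCRE accepts `(?<name>...)`, so treat it as an approximation of what `rex` would do.

```python
import re

# Same pattern as the rex above, with the named group rewritten for Python's `re`.
PIPE_ONLY = re.compile(r"\s*(\s*\|\s*(?P<captureMe>[^\|]+)\|\s*)")


def capture_between_pipes(text):
    """Return the text captured between two pipes, or None if there is no match."""
    match = PIPE_ONLY.search(text)
    return match.group("captureMe") if match else None


samples = [
    "ASDSAD ASDASD ASDAS | STRING001 | ASDA ASDASD ASDASDADADA",
    "ASDSAD ASDASD ASDAS I STRING002 I ASDA ASDASD ASDASDADADA",
    "ASDF ASDF| A&E |FYI",
]

for sample in samples:
    print(repr(sample), "->", repr(capture_between_pipes(sample)))
```

In this sketch the pipe-delimited samples match even without spaces around the pipes, but the captured value keeps the space before the closing pipe, and the sample delimited by an uppercase "I" does not match at all.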

0 Karma

somesoni2
Revered Legend

Give this a try

Updated

your base search | rex field="Site Section" "\s(\||I)\s+(?<SiteSection>.+)\s+(\||I)\s" 
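
The pattern above can be checked against the sample strings from the question outside Splunk. This is a minimal sketch in Python, assuming Python's `re` behaves like Splunk's PCRE for this pattern; the helper name `extract_site_section` is hypothetical, and the named group is written `(?P<...>)` because Python does not accept `(?<...>)`.

```python
import re

# The rex pattern above, with the named group rewritten for Python's `re`.
PATTERN = re.compile(r"\s(\||I)\s+(?P<SiteSection>.+)\s+(\||I)\s")


def extract_site_section(text):
    """Return the SiteSection capture, or None when the pattern does not match."""
    match = PATTERN.search(text)
    return match.group("SiteSection") if match else None


samples = [
    "ASDSAD ASDASD ASDAS | STRING001 | ASDA ASDASD ASDASDADADA",
    "ASDSAD ASDASD ASDAS I STRING002 I ASDA ASDASD ASDASDADADA",
    "ASDASD ASDASDASDA ADASDADAD I AMC I IFC",
    "ASDASD ASDASDASDA ADASDADAD I DISCO I ADASDA",
]

for sample in samples:
    print(repr(sample), "->", repr(extract_site_section(sample)))
```

Because each delimiter must have whitespace on both sides, the "I" inside "DISCO" or at the start of "IFC" is not treated as a delimiter. The same requirement means a pipe with no space after it will not match.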
0 Karma

maximusdm
Communicator

It is a lot better, but if there is an uppercase letter " I " after the second pipe " | ", it still doesn't work properly. Thanks

0 Karma

somesoni2
Revered Legend

Can you share a sample log where it's failing?

0 Karma

maximusdm
Communicator

If you have a string such as ABCDE I AAA I IFC, the result will be "AAA I" and not "AAA" as it should be.

0 Karma

somesoni2
Revered Legend

Will the value you want to capture always be a single word, or can it be multiple words?
Try the updated answer as well.

0 Karma

maximusdm
Communicator

With your update, only one string failed, and that is because there is no space between the second pipe "|" and the text that follows it, for instance:
AASSDDF DFGJKJ | A&E |FYI will return nothing.

PS: strings with 2 words between the pipes work just fine!

0 Karma

somesoni2
Revered Legend

How about this?

your base search | rex field="Site Section" "\s(\||I)\s+(?<SiteSection>.+)\s+(\||I\s)" 
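
The change here is in the closing delimiter: a pipe no longer needs trailing whitespace, while a closing "I" still does. A minimal sketch in Python shows the effect, assuming Python's `re` matches Splunk's PCRE for this pattern; the helper name `extract_relaxed` is hypothetical and the named group uses Python's `(?P<...>)` spelling.

```python
import re

# Closing delimiter is either a bare pipe or an "I" followed by whitespace.
RELAXED = re.compile(r"\s(\||I)\s+(?P<SiteSection>.+)\s+(\||I\s)")


def extract_relaxed(text):
    """Return the SiteSection capture, or None when the pattern does not match."""
    match = RELAXED.search(text)
    return match.group("SiteSection") if match else None


samples = [
    "AASSDDF DFGJKJ | A&E |FYI",   # no space after the second pipe
    "ABCDE I AAA I IFC",           # "I" delimiters followed by a word starting with "I"
    "ASDF ASDF| A&E |FYI",         # no space before the first pipe
    "ASDF ASDF |A&E |FYI",         # no space after the first pipe
]

for sample in samples:
    print(repr(sample), "->", repr(extract_relaxed(sample)))
```

The opening delimiter still requires whitespace on both sides, so the last two samples do not match in this sketch.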
0 Karma

maximusdm
Communicator

Now it fails when there is no space before or after the first pipe LOL
For instance:
ASDF ASDF| A&E |FYI or
ASDF ASDF |A&E |FYI

0 Karma

maximusdm
Communicator

I resolved my problem by replacing each " I " with a pipe before running the next regular expression:

| rex field="Site Section" mode=sed "s,\sI\s, | ,g"
| rex field="Site Section" ".*\|\s*(?<SiteSection>.*)\s*\|"
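
The two-step idea (normalize the delimiters first, then extract between pipes) can be sketched outside Splunk as well. The Python below is only an approximation: it assumes the second step is the pipe-only pattern from the original question, the function name `extract_after_normalizing` is hypothetical, and the final `strip()` is an addition that the searches above do not perform.

```python
import re


def extract_after_normalizing(text):
    """Replace standalone " I " delimiters with pipes, then capture between pipes."""
    # Step 1: mirror the sed expression s,\sI\s, | ,g
    normalized = re.sub(r"\sI\s", " | ", text)
    # Step 2: pipe-only extraction; the greedy .* keeps the space before the
    # closing pipe, so the capture is stripped here (an assumption of this sketch).
    match = re.search(r".*\|\s*(?P<SiteSection>.*)\s*\|", normalized)
    return match.group("SiteSection").strip() if match else None


samples = [
    "ASDASD ASDASDASDA ADASDADAD I AMC I IFC",
    "ASDASD ASDASDASDA ADASDADAD I DISCO I ADASDA",
    "AASSDDF DFGJKJ | A&E |FYI",
    "ASDF ASDF| A&E |FYI",
    "ASDF ASDF |A&E |FYI",
]

for sample in samples:
    print(repr(sample), "->", repr(extract_after_normalizing(sample)))
```

Since only an "I" with whitespace on both sides is rewritten, words such as "DISCO" and "IFC" are left alone, and the pipe-only pattern does not require spaces around the pipes.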

I want to thank you for pointing me in the right direction.

0 Karma