How to edit my regex to extract a variable string that may have either dashes or spaces?

ahogbin
Communicator

Hello,

I am trying to put together a regex to extract a string. The issue I have is that the string sometimes contains dashes as a separator, as in 11-23345-6778-CMP, and sometimes there is simply a space, as in 11 23345 8897 CMP.

I have a regular expression that extracts the string with the dashes, but I am struggling to work out how to also ask the same expression to extract strings that have a space instead.

Is it even possible to combine the two?

The expression I have is:

rex "(?i)\\|.*?\\|(?P<POLICYNUMBERS>\\d+\\-[a-f0-9]+\\-\\w+)"

Any help or advice is as always greatly appreciated.

Cheers,

Alastair

jeffland
SplunkTrust
SplunkTrust

First of all, please post your regexes as code, otherwise the markup will mess them up.

There are usually a few ways to get there with regex, and this case is no exception. You could set up alternatives to your dashes with |, but you can also just use a less precise item such as . to match either a dash or a whitespace in that position.

Loosely based on your original regex, it could look something like:

(?<POLICYNUMBERS>\d{2}.\d{5}.\d{4}.\w{3})

And lastly, you should use a tool like https://regex101.com/ to help you with any regex matters 🙂
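
For anyone who wants to try both ideas outside Splunk, here is a minimal Python sketch. It is only an illustration under a few assumptions: Python's re module needs the (?P<name>...) spelling for named groups, the STRICT pattern with [-\s] is one possible way to write the "alternatives" idea and is not taken from the thread, and the third sample string is made up to show where the loose . matches more than intended.

```python
import re

# Loose variant from the post: "." accepts any single character as separator.
LOOSE = re.compile(r"(?P<POLICYNUMBERS>\d{2}.\d{5}.\d{4}.\w{3})")
# Assumed stricter variant: only a dash or a whitespace is accepted.
STRICT = re.compile(r"(?P<POLICYNUMBERS>\d{2}[-\s]\d{5}[-\s]\d{4}[-\s]\w{3})")

samples = [
    "11-23345-6778-CMP",   # dash-separated, from the question
    "11 23345 8897 CMP",   # space-separated, from the question
    "11x23345x6778xCMP",   # hypothetical string that is not a policy number
]

def extract(pattern, text):
    """Return the POLICYNUMBERS group, or None when nothing matches."""
    match = pattern.search(text)
    return match.group("POLICYNUMBERS") if match else None

loose_results = [extract(LOOSE, s) for s in samples]
strict_results = [extract(STRICT, s) for s in samples]
```

The loose pattern accepts all three samples, including the made-up one, while the stricter pattern rejects it. Which trade-off is acceptable depends on what else appears in the events.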

ahogbin
Communicator

This works. However, the format of the extracted string is not always the same. For example:
1-85-F792378
87-F833763-CMP
1 45 122434

I have attempted to use wildcards in the regex, but to no avail. Despite the explanation provided in regex101 looking correct, I am unable to extract the required information.

All rather frustrating and my severely limited knowledge of regex is not helping 😉

Cheers,

Alastair

jeffland
SplunkTrust
SplunkTrust

We can get there using other means as well... for example, does the string have only the three variants you just posted, i.e. can we work with the number of characters possible in each position? Then something like this could work:

(?<POLICYNUMBERS>(?:\d(?:\s|\-)\d{2}(?:\s|\-)\w+|\d{2}\-\w{7}\-\w{3}))
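
A quick way to check this pattern against the three posted variants is a small Python sketch like the one below. It assumes nothing beyond the sample strings from the thread; the only change is the (?P<name>...) group syntax that Python's re module requires.

```python
import re

# Same alternation as above: either
#   digit, separator, two digits, separator, word characters
# or
#   two digits, dash, seven word characters, dash, three word characters.
VARIANTS = re.compile(
    r"(?P<POLICYNUMBERS>(?:\d(?:\s|\-)\d{2}(?:\s|\-)\w+|\d{2}\-\w{7}\-\w{3}))"
)

posted = ["1-85-F792378", "87-F833763-CMP", "1 45 122434"]

def extract_variant(text):
    """Return the captured policy number, or None when nothing matches."""
    match = VARIANTS.search(text)
    return match.group("POLICYNUMBERS") if match else None

variant_results = [extract_variant(s) for s in posted]
```

Each of the three strings should come back unchanged, which suggests the two alternatives together cover the posted formats.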

Alternatively, the idea could be adjusted to allow some variation. This one, for example, reads an element of one to three digits, then two elements of one to seven word characters each, and accepts a whitespace or a dash between them:

(?<POLICYNUMBERS>\d{1,3}(?:\s|\-)\w{1,7}(?:\s|\-)\w{1,7})

Be careful with this, as it may match other data as well.
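
To make that warning concrete, here is a small Python sketch of the flexible pattern. The log text "retry 3 times now" is a made-up example and not from the actual data; the group syntax is again adapted to Python's (?P<name>...).

```python
import re

# Flexible variant: 1-3 digits, then two groups of 1-7 word characters,
# each preceded by a whitespace or a dash.
FLEXIBLE = re.compile(
    r"(?P<POLICYNUMBERS>\d{1,3}(?:\s|\-)\w{1,7}(?:\s|\-)\w{1,7})"
)

def extract_flexible(text):
    """Return the captured value, or None when nothing matches."""
    match = FLEXIBLE.search(text)
    return match.group("POLICYNUMBERS") if match else None

# The three formats posted in the thread.
policy_hits = [
    extract_flexible(s)
    for s in ["1-85-F792378", "87-F833763-CMP", "1 45 122434"]
]

# Hypothetical unrelated text that also fits the shape of the pattern.
false_hit = extract_flexible("retry 3 times now")
```

The three policy numbers are extracted as expected, but the unrelated text also yields a value ("3 times now"), so this variant is probably only safe when the surrounding events cannot contain similar digit-and-word sequences.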
Or is your string uniquely identifiable based on what comes before and/or after it, i.e. does your data look like

[beginning of line]foo 1-85-F792378 some_identifier=x
[beginning of line]foo 87-F833763-CMP some_identifier=y
[beginning of line]foo 1 45 122434 some_identifier=z

Because then we could capture everything based on its position with something like

^foo\s(?<POLICYNUMBERS>.+?)\ssome_identifier
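
A minimal Python sketch of the context-based idea, assuming the placeholder prefix foo and the placeholder field some_identifier from the sample lines (both are stand-ins, not real field names). It uses a lazy .+? between the two known anchors. A negated character class is avoided here because it excludes single characters, whitespace included, and would cut off the space-separated values.

```python
import re

# Capture whatever sits between the known prefix and the known suffix.
# "foo" and "some_identifier=" are placeholders from the sample lines.
CONTEXT = re.compile(r"^foo\s(?P<POLICYNUMBERS>.+?)\ssome_identifier=")

events = [
    "foo 1-85-F792378 some_identifier=x",
    "foo 87-F833763-CMP some_identifier=y",
    "foo 1 45 122434 some_identifier=z",
]

def extract_by_context(event):
    """Return the text between the anchors, or None when they are absent."""
    match = CONTEXT.search(event)
    return match.group("POLICYNUMBERS") if match else None

context_results = [extract_by_context(e) for e in events]
```

Because the capture depends only on the surrounding text, it does not care whether the separators are dashes or spaces, but it does rely on the prefix and suffix being present in every event.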
ahogbin
Communicator

Hello,
The first example worked a treat, as the possible number / letter combination is limited to the three string variants.
Thank you so much for your help, it really is appreciated.
Cheers,

Alastair
