Splunk Search

How do I write a regex to extract all the numbers in a string?

sameeripro
Path Finder

I need to extract all the numbers in the string below. I am using "(?\d+[0-9])", but it is not extracting the zeros and I am getting only 53 as the answer.

0.0.0.53.IN-BRR.CRPD 1.0.0.127.icmpbugtest.157.kn-ddr.prpd

The answer I need is: 00053100127157

1 Solution

dineshraj9
Builder

If you want only the digits in the raw event and nothing else, then instead of extracting them, simply use replace to remove everything else:

<base search> | eval digits=replace(_raw,"\D","") | table digits
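
As a quick check, the replace approach above can be tried on the sample string from the question without any indexed data. This is a minimal sketch that assumes the whole string sits in _raw; makeresults is used here only to fabricate a single test event.

```spl
| makeresults
| eval _raw="0.0.0.53.IN-BRR.CRPD 1.0.0.127.icmpbugtest.157.kn-ddr.prpd"
| eval digits=replace(_raw, "\D", "")
| table digits
```

Because \D matches every non-digit character and each match is replaced with an empty string, digits should come out as 00053100127157, the value asked for in the question.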

woodcock
Esteemed Legend

Like this:

| eval digits=_raw | rex field=digits mode=sed "s/\D//g"

sameeripro
Path Finder

@woodcock This query worked as well, thank you very much.
| eval digits=_raw | rex field=digits mode=sed "s/\D//g"

sameeripro
Path Finder

@woodcock I am seeing the mode=sed "s/\D//g" type of regex for the first time. Can you shed some light on it, as I want to learn?

woodcock
Esteemed Legend

There is a Unix tool called sed that uses RegEx-based syntax but has peculiarities of its own. It is particularly useful when you need to strip characters or rearrange substrings. mode=sed tells rex to treat the command string as a sed expression, rather than a normal RegEx, and to apply it to modify the field.
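
For reference, the sed expression used above has the form s/<regex>/<replacement>/<flags>: s means substitute, \D is the pattern to find, the empty replacement deletes each match, and the g flag applies the substitution to every match rather than only the first. A minimal sketch on the sample string from the question, with makeresults standing in for a real base search:

```spl
| makeresults
| eval digits="0.0.0.53.IN-BRR.CRPD 1.0.0.127.icmpbugtest.157.kn-ddr.prpd"
| rex field=digits mode=sed "s/\D//g"
| table digits
```

Note that mode=sed rewrites the field in place, which is why the original string is first copied into digits instead of running the substitution on _raw directly.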

DalJeanis
SplunkTrust

This is documented in the Search Reference under the rex command; look for mode=sed.

Go to regex101.com to test any particular regular expression, or go to the sites below to learn about them. I would suggest starting with the first one, although in a number of places it describes multiple different "flavors" or "dialects" rather than being Splunk-specific.

http://www.regular-expressions.info/reference.html
https://regexone.com/

The other thing to be aware of is that sometimes you will have to escape (put a backslash in front of) a character in Splunk so that the Splunk processor interprets the regular expression correctly. It takes a little familiarity to know when to add extra backslashes, because Splunk makes multiple passes over the regex string. Don't worry about that too much; just get your feet wet and ask for help when you have a specific question.
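
One common case of this, offered only as an illustration: the regex given to rex is written inside a double-quoted string, so a literal double quote in the pattern has to be escaped with a backslash. In the sketch below the sample event and the field name user are made up for the example.

```spl
| makeresults
| eval _raw="user=\"alice\" action=login"
| rex field=_raw "user=\"(?<user>[^\"]+)\""
| table user
```

Here the pattern the regex engine sees is user="(?<user>[^"]+)", which should capture alice into the user field.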

dineshraj9
Builder

You can go to http://regex101.com/ to learn more about regular expressions and test them.

\d stands for any digit
\D stands for any non-digit

More on sed expressions here: http://docs.splunk.com/Documentation/Splunk/6.6.1/SearchReference/Rex#Sed_expression
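
To see the difference between the two tokens side by side, here is a small sketch that uses one of the example strings from this thread. The field names sample, digits_removed and digits_only are arbitrary names chosen for the illustration.

```spl
| makeresults
| eval sample="123example567.com"
| eval digits_removed=replace(sample, "\d", "")
| eval digits_only=replace(sample, "\D", "")
| table sample digits_removed digits_only
```

Removing every \d match should leave example.com, while removing every \D match should leave 123567.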

dineshraj9
Builder

If you want only the digits in the raw event and nothing else, then instead of extracting them, simply use replace to remove everything else:

<base search> | eval digits=replace(_raw,"\D","") | table digits

jkat54
SplunkTrust

Just like this, but you might prefer to replace _raw with the name of the field that contains the string.

sameeripro
Path Finder

@jkat54 Thank you very much

sameeripro
Path Finder

This worked. Since I wanted to extract numbers only from a single field, "query", I used | eval digits=replace(query,"\D","") | table digits

DalJeanis
SplunkTrust

@dineshraj9 - great answer!

@jkat54 - It's a kind of weird string, so I'd expect that it probably is the entire _raw, but you were right that optionally changing the source field was worth mentioning.

jkat54
SplunkTrust

Yeah, I didn't see that space in the string... it's possible this is two fields, etc. So the _raw approach is probably the best.

inventsekar
Ultra Champion

Well, check this one:
your base search | rex field=_raw "(?<YourNumber>\d+)$" | table YourNumber

sameeripro
Path Finder

With rex field=_raw "(?<YourNumber>\d+)$" I am getting only the last part.

Example

For 10.0.0.1 I am getting 1
For 123example567.com I am getting 567

DalJeanis
SplunkTrust

I downvoted this post to remove my accidental upvote.

jkat54
SplunkTrust

To un-upvote just click on the up arrow again. It should turn grey.

DalJeanis
SplunkTrust

That will extract the first set of consecutive digits in _raw, which in this example would be a single 0 character.

To make a correct extraction, add max_match=0, then use mvjoin with an empty string as the separator to concatenate the values of the multivalue field into a single string. dineshraj9's method is more elegant.

your base search | rex field=_raw "(?<YourNumber>\d+)$" max_match=0 | eval YourNumber=mvjoin(YourNumber,"") | table YourNumber

sameeripro
Path Finder

With this query I am getting results only if the string ends with a number.

Example
For 10.0.0.1 I am getting 1
For 123example567.com I am getting zero results
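
The trailing $ in the pattern anchors the match to the end of the event, which would explain getting results only when the string ends with a number. Below is a sketch of the same max_match=0 and mvjoin approach with the $ removed, run against the sample string from the question. makeresults is only a stand-in for the base search.

```spl
| makeresults
| eval _raw="0.0.0.53.IN-BRR.CRPD 1.0.0.127.icmpbugtest.157.kn-ddr.prpd"
| rex field=_raw max_match=0 "(?<YourNumber>\d+)"
| eval YourNumber=mvjoin(YourNumber, "")
| table YourNumber
```

Each run of digits becomes one value of the multivalue field YourNumber (0, 0, 0, 53, 1, 0, 0, 127, 157), and mvjoin with an empty separator should concatenate them into 00053100127157.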
