Trouble searching for multiple values using rex

lain179
Communicator

I am having trouble searching for multiple patterns using rex. I have log files containing lines with the following pattern:

  • BLAH BLAH BLAH, Processtype : <12345> BLAH BLAH BLAH.

I want a table report of the Processtype and the docCount, which in this example is 12345.

I can search for PROCESSTYPE

sourcetype="SOMETHING SOMETHING" | rex field=_raw ".*, (?<1ST>[A-Z][a-z]+) :.*"

I can also search for docCount

sourcetype="SOMETHING SOMETHING" | rex field=_raw ".* \<(?<2ND>[0-9]+)\>"

But when I combine the two like this, it doesn't return any results:

 sourcetype="SOMETHING SOMETHING" | rex field=_raw ".*, (?<1ST>[A-Z][a-z]+) : \<(?<2ND>[0-9]+)\>"

What am I doing wrong?

1 Solution

Rob
Splunk Employee

I have found that there can occasionally be a problem with colons in a regex, probably due to EREG syntax compatibility. While I agree that regex builders on various web pages can be useful, they are rarely fully functional for complex regex building.

You may want to try escaping all non-alphanumeric characters in your regex to be on the safe side. Whitespace can also be irregular, so my second recommendation is to match it with \s+ instead of a literal space. Taking your example as a base, you may want to try the following regex:

sourcetype="SOMETHING SOMETHING" | rex field=_raw ".*,\s+(?<1ST>[A-Z][a-z]+)\s+\:\s+\<(?<2ND>[0-9]+)\>"
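If you want to sanity-check the whitespace-tolerant idea outside Splunk, here is a minimal Python sketch. It is only an approximation: Python's re module is not the regex engine Splunk uses, it spells named groups (?P<name>...), and the group names process_type and doc_count as well as the second sample line are made up for illustration. The leading .* and the extra backslash escapes are dropped because re.search scans the whole line anyway.

```python
import re

# Whitespace-tolerant pattern; Python spells named groups (?P<name>...).
# The group names are illustrative, not taken from the original search.
PATTERN = re.compile(
    r",\s+(?P<process_type>[A-Z][a-z]+)\s+:\s+<(?P<doc_count>[0-9]+)>"
)

def extract(line):
    """Return (process_type, doc_count), or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group("process_type"), m.group("doc_count")

# Sample lines modeled on the one in the question, with varying whitespace.
samples = [
    "BLAH BLAH BLAH, Processtype : <12345> BLAH BLAH BLAH.",
    "BLAH BLAH BLAH,  Processtype\t:   <12345> BLAH BLAH BLAH.",
]
for line in samples:
    print(extract(line))
```

Both sample lines should yield the same pair of values even though the spacing around the colon differs.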

Rob
Splunk Employee

Glad to hear that you found what helps. What is the regex that ended up resolving this for you?

lain179
Communicator

I believe it was because of the greediness of the regex search. Using a negated character class, [^ ], to exclude a whole bunch of characters seems to have helped.
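The exact regex is not shown here, so the following is only a guess at the kind of effect meant. It is a small Python sketch (Python's re, not Splunk's engine, with (?P<name>...) group syntax) comparing a greedy .* prefix with a negated character class on a made-up line that contains two candidate matches. The sample line and the group names name and count are assumptions.

```python
import re

# Hypothetical line with two "Word : <number>" segments.
line = "BLAH, Alpha : <111> BLAH, Beta : <222> BLAH."

# Greedy prefix: .* grabs as much as it can, so the LAST segment is captured.
greedy = re.compile(r".*, (?P<name>[A-Z][a-z]+) : <(?P<count>[0-9]+)>")

# Negated class: [^,]* cannot cross a comma, so the FIRST segment is captured.
bounded = re.compile(r"^[^,]*, (?P<name>[A-Z][a-z]+) : <(?P<count>[0-9]+)>")

g = greedy.search(line)
b = bounded.search(line)
print(g.group("name"), g.group("count"))
print(b.group("name"), b.group("count"))
```

A negated class makes it explicit where the prefix has to stop, instead of leaving that to backtracking.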

Gilberto_Castil
Splunk Employee

Hey, as far as I can tell there is nothing technically incorrect with the regex pattern itself. Having dealt with a similar situation in the past, I have to ask whether you are using capture field names that actually begin with a digit. I recall reading somewhere, a long time ago, about acceptable naming for capture groups, which led me to change the name of a capture. Sorry I cannot link it; it was a long time ago and a fluke.

More explicitly, are you using 1ST and 2ND, literally? If so, you will have to use a name that begins with a letter, meaning the first character of your capture field name should be a-z or A-Z.

By the way, your regex, as quoted above, fails in RegEx Builder, RegexBuddy and RegExr. If I alter the regex just slightly to

.*, (?<first>[A-Z][a-z]+) : <(?<second>[0-9]+)>

then it works O.K. The same is true in Splunk.
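For what it's worth, the same naming rule can be reproduced outside Splunk. This is a rough Python sketch, assuming Python's re module as a stand-in (it is not the engine Splunk uses and writes named groups as (?P<name>...)); the helper name compiles is made up for the example.

```python
import re

def compiles(pattern):
    """Return True if the pattern compiles, False if the engine rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Same pattern twice; only the capture group names differ.
digit_name = r".*, (?P<1ST>[A-Z][a-z]+) : <(?P<2ND>[0-9]+)>"
letter_name = r".*, (?P<first>[A-Z][a-z]+) : <(?P<second>[0-9]+)>"

print(compiles(digit_name))   # names starting with a digit are rejected
print(compiles(letter_name))  # names starting with a letter are accepted

# With valid names, both values come out of one combined pattern.
m = re.search(letter_name, "BLAH BLAH BLAH, Processtype : <12345> BLAH BLAH BLAH.")
print(m.group("first"), m.group("second"))
```

If the combined pattern only starts working after the rename, the group names were the problem rather than the pattern itself.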

I hope this helps.

lain179
Communicator

Thank you for confirming. My thought is that, if my regex were wrong, I shouldn't get any results back when I search for each one separately. I get the correct words when I search for them one at a time. I only have a problem when I combine them.

sowings
Splunk Employee

No, there's no restriction in the config. Consider checking out one of the regex sites listed above to validate your regex vs. the data. My personal favorite is RegExr.

lain179
Communicator

Hi,

My rex syntax got altered when the message was posted. I don't know how to post it as code. My variables don't start with numbers, and my syntax is exactly like what you posted.

Is there something in the configuration that prevents me from creating more than one variable in a rex search?
