Help with regex

lemikg
Communicator

Hi,

I extracted a field with the Splunk Field Extractor. It seemed to work until I noticed that it didn't capture all messages from ModSecurity (e.g. CSRF Attack Detected - Missing CSRF Token).

Here are some log messages:

--f7d234hc-H--
Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]
Message: Failed to write to DBM file "/tmp/global": Invalid argument
Apache-Handler: perl-script
--f7d3t15d-Z--

This is what the app gave me:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.]+)\"\]

Is there something wrong with it? Can it be done more efficiently?

Thanks in advance.

Cheers
Mike

dmr195
Communicator

I think it's because there's a hyphen missing inside the innermost square brackets. Try:

(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.-]+)\"\]

instead. (In case it's hard to see, the difference is 8 characters from the end.)

Your previous regex was only looking for letters, numbers, underscores, whitespace, slashes and dots between the double quotes. Hence it didn't match because "CSRF Attack Detected - Missing CSRF Token" has a hyphen in the middle.
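
To see the effect of the missing hyphen in isolation, here is a minimal Python sketch (Python's re module accepts the same (?P<msg>...) named-group syntax). It tests only the [msg "..."] part of the pattern against the first Message line from the question. The variable names are made up for illustration, and Splunk's regex engine may differ from Python's in other details.

```python
import re

# First "Message:" line from the sample event in the question.
line = (
    'Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. '
    '[file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] '
    '[id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]'
)

# Character class as generated by the Field Extractor: no hyphen allowed.
without_hyphen = re.compile(r'\[msg \"(?P<msg>[\w\s\/.]+)\"\]')
# Same class with a literal hyphen added at the end.
with_hyphen = re.compile(r'\[msg \"(?P<msg>[\w\s\/.-]+)\"\]')

no_match = without_hyphen.search(line)
match = with_hyphen.search(line)

print(no_match)            # the hyphen in the message stops the first pattern
print(match.group("msg"))  # the second pattern captures the full message text
```

Placing the hyphen last (or first) inside the square brackets keeps it from being read as a range operator.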

bjoernjensen
Contributor

Hi,

here are more things to be considered:

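(a) it seems that the section marker does not always consist of a hex-coded ID between hyphens followed by "H" (the ID in your sample, f7d234hc, contains an "h", and the closing marker ends in "Z")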
(b) you aren't getting the whole message text if it contains a hyphen

Something like this should work:
(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg \"(?P<msg>[-\w\s\/.]+)\"\]
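
As a rough check of point (a), the following Python sketch runs both variants against the sample event exactly as posted in the question. The variable names are hypothetical, and the sample ID may have been altered before posting, so treat this as an illustration rather than a statement about real ModSecurity IDs.

```python
import re

# Sample event from the question, joined with newlines.
event = "\n".join([
    '--f7d234hc-H--',
    'Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. '
    '[file "/cut/modsecurity_crs_43_csrf_protection.conf"] [line "31"] '
    '[id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]',
    'Message: Failed to write to DBM file "/tmp/global": Invalid argument',
    'Apache-Handler: perl-script',
    '--f7d3t15d-Z--',
])

# Hex-only ID and a fixed "H" section letter (hyphen already allowed in msg).
hex_only = re.compile(r'(?s)--[0-9a-f]+-H--\n.*\[msg \"(?P<msg>[\w\s\/.-]+)\"\]')
# Wider ID class and any upper-case section letter, as suggested above.
relaxed = re.compile(r'(?s)--[0-9a-z]+-[A-Z]--\n.*\[msg \"(?P<msg>[-\w\s\/.]+)\"\]')

hex_result = hex_only.search(event)
relaxed_result = relaxed.search(event)

print(hex_result)                   # the "h" in f7d234hc is outside [0-9a-f]
print(relaxed_result.group("msg"))  # full message text is captured
```

Note that with (?s) the greedy .* spans line breaks, so if an event contains several [msg "..."] entries this pattern captures the last one.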

dmr195
Communicator

I feel a little guilty that my answer was accepted here, as I missed the first required change. The regex in this answer is the one to use.

lemikg
Communicator

Thanks to you, too. I tried that as well and it worked. Have a great one.
cheers
Mike

lemikg
Communicator

It seems that did the trick. Thank you very much.
