Help with REGEX for data filter

nce054
Path Finder

In my transforms.conf I currently have

[filter-marimba]
REGEX=^(?!\[[^\]]+\]\s+-\s+warning.*)
DEST_KEY = queue
FORMAT = nullQueue

It should be catching any message that doesn't have 'warning' in it.
This is data it should be allowing

[19/Jun/2015:09:07:36 -0500] - warning nce054_a 50012 Common Reboot Service is disabled.

This is some data it should be sending to the null queue

[19/Jun/2015:09:07:42 -0500] - info nce054_a 9236 Adapter stopped, packaged channel URL: http://mrbamtx:5282/Root/Prod/DeskMgmt/PrintQMigration

Any advice?


woodcock
Esteemed Legend

I just tested this configuration and it DOES work:

[filter-marimba]
REGEX=^\[[^\]]+\]\s+-\s+(?!warning).*
DEST_KEY = queue
FORMAT = nullQueue

It should be trashing any message that doesn't have 'warning' in that particular spot.
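
A quick way to sanity-check this pattern outside Splunk is a short Python sketch. Python's `re` module stands in for Splunk's PCRE engine here (an assumption; the two should agree on the constructs this pattern uses), and `goes_to_null_queue` is a hypothetical helper name for illustration, not a Splunk API.

```python
import re

# The REGEX from the [filter-marimba] stanza above.
FILTER = re.compile(r"^\[[^\]]+\]\s+-\s+(?!warning).*")

def goes_to_null_queue(event: str) -> bool:
    """Hypothetical helper: True when the transform's REGEX matches (event is dropped)."""
    return FILTER.search(event) is not None

# The two sample events from the question.
warning_event = ("[19/Jun/2015:09:07:36 -0500] - warning nce054_a 50012 "
                 "Common Reboot Service is disabled.")
info_event = ("[19/Jun/2015:09:07:42 -0500] - info nce054_a 9236 Adapter stopped, "
              "packaged channel URL: http://mrbamtx:5282/Root/Prod/DeskMgmt/PrintQMigration")

print(goes_to_null_queue(warning_event))  # False: the warning line is kept
print(goes_to_null_queue(info_event))     # True: the info line is dropped
```

The lookahead sits right after the `[timestamp] - ` prefix, so only the word in that position decides whether the event is discarded.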

ccraig42
Engager

Are you saying that the regular expression is sending the line with "warning" to the null queue, or that it is not sending the other one? I would have written a match to send '^\[[^\]]+\]\s+-\s+warning' to a real queue instead of using a negative lookahead assertion, and then sent everything else to the nullQueue, but the regex you have should work on those example lines.
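
The two-transform design described here can be simulated in Python to see how it would route the sample lines. This is only a sketch: `RULES` and `route` are hypothetical names, not Splunk code, and it assumes that transforms listed in one TRANSFORMS- setting are applied in order, so that a later match overrides an earlier one. Python's `re` is used in place of Splunk's PCRE.

```python
import re

# (pattern, destination queue) pairs, applied in order.
# A later match overrides an earlier one (assumed Splunk behaviour).
RULES = [
    (re.compile(r"."), "nullQueue"),                           # step 1: discard everything
    (re.compile(r"^\[[^\]]+\]\s+-\s+warning"), "indexQueue"),  # step 2: keep warnings
]

def route(event: str) -> str:
    """Hypothetical helper: return the queue an event would end up in."""
    queue = "indexQueue"  # default destination when no rule matches
    for pattern, dest in RULES:
        if pattern.search(event):
            queue = dest
    return queue

print(route("[19/Jun/2015:09:07:36 -0500] - warning nce054_a 50012 Common Reboot Service is disabled."))
print(route("[19/Jun/2015:09:07:42 -0500] - info nce054_a 9236 Adapter stopped"))
```

The first call prints `indexQueue` and the second prints `nullQueue`. The trade-off against the lookahead version is two stanzas instead of one, in exchange for a positive pattern that is easier to read.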

woodcock
Esteemed Legend

It takes 2 steps to do it that way which is why I showed you a 1-step design that throws away what matches.

ccraig42
Engager

Sorry, I'm not trying to say it's a bad regex; that's exactly my problem. It's not what I would have used, but it should work with the example data he has. That makes it rather difficult to produce a regular expression that works on the data that is failing but hasn't been provided.

woodcock
Esteemed Legend

Is this still true?

inputs.conf

[monitor://C:\Windows\.marimba\MarimbaEndpointTuner\history-y*.log]
sourcetype = marimba

props.conf

[marimba]
TRANSFORMS-mfilter=filter-marimba

rkent
Explorer

These regexes use a negative lookahead to match lines (in the same format as the samples you've provided) that do not have the word warning or info, respectively, right after the timestamp:

^(?:\[[^\]]+\]\s+-)\s+(?!warning).*$
^(?:\[[^\]]+\]\s+-)\s+(?!info).*$

You can test the first of the regexes at the following link: http://regexr.com/3b8m7

To test, remove one of the letters from the word "warning" and you will see it instantly match the sample event.
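
The same remove-a-letter test can be reproduced locally with a small Python sketch. It assumes the patterns are meant to read with escaped brackets and a `.*` before the `$` (the forum formatting may have eaten those characters), and it uses Python's `re` rather than Splunk's PCRE.

```python
import re

# Assumed intended form of the two patterns (escaped brackets, ".*" before "$").
NOT_WARNING = re.compile(r"^(?:\[[^\]]+\]\s+-)\s+(?!warning).*$")
NOT_INFO = re.compile(r"^(?:\[[^\]]+\]\s+-)\s+(?!info).*$")

sample = ("[19/Jun/2015:09:07:36 -0500] - warning nce054_a 50012 "
          "Common Reboot Service is disabled.")
altered = sample.replace("warning", "warnng")  # drop one letter, as suggested above

print(bool(NOT_WARNING.search(sample)))   # False: "warning" blocks the match
print(bool(NOT_WARNING.search(altered)))  # True: misspelled word no longer blocks it
print(bool(NOT_INFO.search(sample)))      # True: the second pattern matches warning lines
```

The last line is worth noting: each pattern on its own still matches lines containing the other word, so using both as separate nullQueue filters would discard warning lines as well.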

nce054
Path Finder

This REGEX=^# is essentially saying that any input string that matches it (anything that starts with #) should be sent to the nullQueue, i.e. not sent to the indexer. Is that correct? When I put the expression ^# into https://regex101.com/ and insert one of my pieces of data, like

#[19/Jun/2015:09:31:29 -0500]
ret = 0

it says that there is a match, but no groups were extracted.
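
The "match, but no groups" result can be reproduced with a few lines of Python (standing in for regex101; an illustrative sketch only). A pattern with no parentheses has no capture groups to extract. For a transform with DEST_KEY = queue and a fixed FORMAT = nullQueue, it should only matter whether the REGEX matches at all, since FORMAT references no groups.

```python
import re

# First line of the sample event quoted above.
event = "#[19/Jun/2015:09:31:29 -0500]"

match = re.search(r"^#", event)

print(match is not None)  # True: the pattern matched
print(match.groups())     # (): no capture groups, because the pattern defines none
```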

woodcock
Esteemed Legend

ARGH! Then your example data (admittedly from previous question) was not written correctly by you!
Try this one:

REGEX= \]\s+#

nce054
Path Finder

I pushed this REGEX out in the transforms.conf at 9:46, and I still found this at 9:52 on the Search Head :
#[19/Jun/2015:09:52:42 -0500]
#ret = 0

woodcock
Esteemed Legend

Is there really a leading '#' character? You really have to make up your mind about what your data looks like! In any case, I am worn out. I have used this basic configuration a dozen times and it works great, so I am convinced the problem is the RegEx, but I will let somebody else comment.

nce054
Path Finder

Yes, there is really a leading '#' character. Thanks for the effort.

nce054
Path Finder

My apologies, I messed up on that one. The basic point of this is trying to remove events where the only content is comments. Sorry for the confusion. According to https://regex101.com , that pattern wouldn't match the junk data. If I'm wrong, please let me know. My experience with REGEX is extremely limited. Thanks for your time.

woodcock
Esteemed Legend

In this use case the RegEx is not anchored, so it can match anywhere in the event, but your testing tool apparently expects the pattern to match the whole line. To test with your tool, add .* to the front and back ends of the RegEx, like this: .*\]\s+#.*
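
The difference between an unanchored search and a whole-line match can be sketched in Python, where `re.search` looks anywhere in the string and `re.fullmatch` requires the entire string to match. The first test line is hypothetical (shaped like "[timestamp] # comment", not taken from the real log); the second is the event as quoted elsewhere in this thread.

```python
import re

pattern = r"\]\s+#"

# Hypothetical line shaped like "[timestamp] # comment"; not from the real log.
line = "[19/Jun/2015:09:31:29 -0500] # comment only"

print(bool(re.search(pattern, line)))                   # True: found mid-line
print(bool(re.fullmatch(pattern, line)))                # False: whole-line match fails
print(bool(re.fullmatch(".*" + pattern + ".*", line)))  # True: padded with .* as suggested

# The event as it actually appears in this thread starts with '#'.
actual = "#[19/Jun/2015:09:52:42 -0500]"

print(bool(re.search(pattern, actual)))  # False: nothing follows the ']'
print(bool(re.search(r"^#", actual)))    # True: a leading '#' is what identifies it
```

If the real events do begin with '#', as stated in this thread, the last two lines suggest why the `\]\s+#` filter left them in the index.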

nce054
Path Finder

Yes, both of those are the same. Sorry for starting a new thread, I felt like it was getting congested. Trying the REGEX = ^# right now.
