Getting Data In

Yet another filtering problem. Many transforms in transforms.conf not filtering

neusse
Path Finder

I am trying to filter with many transform statements. I believe everything is configured correctly, but ALL events are indexed and none are going to the nullQueue. Please help! It seems easy enough, but I am not getting this.

As I understand it:

In my props.conf I have two stanzas. One is [mod_security], which handles event line-breaking and extracts a few fields.

The other is [source::/some/path/to/directory/]. It holds all the transform statements that check whether an event should be sent to the nullQueue.

What happens is that all events are indexed and none are filtered. I have also tried adding all the transform statements to the [mod_security] stanza, but I get the same result: everything is indexed.

Lastly, I want to check that a couple of strings exist in the event after filtering. If they do, index the event; if not, send it to the nullQueue.

Here are my props.conf and transforms.conf. Any help would be appreciated.


Props.conf

[mod_security]
TRUNCATE = 0 
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = (--[a-z0-9]+-A--)
REPORT-get = get
REPORT-post = post
REPORT-severity = severity


[source::/nas/log/cache/httpd-www-80/]
TRANSFORMS-f1  = f1
TRANSFORMS-f2  = f2
TRANSFORMS-f3  = f3
TRANSFORMS-f4  = f4
TRANSFORMS-f5  = f5
TRANSFORMS-f6  = f6
TRANSFORMS-ok = null,ok

=========================================
transforms.conf


[get]
REGEX = (GET.+?)$
FORMAT = AA_get::$1

[post]
REGEX = (POST.+?)$
FORMAT = AA_post::$1

[severity]
REGEX = (severity.+?)\]
FORMAT = AA_severity::$1

[f1]
REGEX = (99\.99\.99\.38)
DEST_KEY = queue
FORMAT = nullQueue

[f2]
REGEX = .*404\sNot\sFound.*
DEST_KEY = queue
FORMAT = nullQueue

[f3]
REGEX = .*401\sUnauthorized.*
DEST_KEY = queue
FORMAT = nullQueue

[f4]
REGEX = .*500\sInternal\sServer\sError.*
DEST_KEY = queue
FORMAT = nullQueue

[f5]
REGEX = .*403\sForbidden.*
DEST_KEY = queue
FORMAT = nullQueue

[f6]
REGEX = FILTERED\sTO\s
DEST_KEY = queue
FORMAT = nullQueue

[null]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[ok]
REGEX = (Pattern\smatch)|(Matched\ssignature)
DEST_KEY = queue
FORMAT = indexQueue
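For context on the intended `null,ok` chaining above: transforms listed in one TRANSFORMS- clause run left to right, and a later matching transform overwrites the queue set by an earlier one. So [null] routes every event to the nullQueue, then [ok] re-routes whitelisted events back to the indexQueue. A simplified simulation of that routing, in hypothetical Python rather than anything Splunk ships:

```python
import re

# Simplified model of a Splunk TRANSFORMS- clause like "TRANSFORMS-ok = null,ok":
# transforms run in order, and the last one whose REGEX matches sets the queue.
TRANSFORMS = [
    (r".", "nullQueue"),  # [null]: matches any event, routes it to nullQueue
    (r"(Pattern\smatch)|(Matched\ssignature)", "indexQueue"),  # [ok]: whitelist
]

def route(raw_event: str) -> str:
    queue = "indexQueue"  # events are indexed unless a transform reroutes them
    for regex, dest in TRANSFORMS:
        if re.search(regex, raw_event):
            queue = dest
    return queue

print(route("GET /index.html HTTP/1.1 200 OK"))        # nullQueue
print(route('ModSecurity: Pattern match "select"'))    # indexQueue
```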
 

neusse
Path Finder

Yes, there is a solution. I found nothing describing this anywhere on splunk.com.

My issue of not being able to search deep enough into the event at index time was solved with the LOOKAHEAD setting in transforms.conf. It turns out that at index time Splunk only applies a transform's REGEX to the first LOOKAHEAD characters of the event (4096 by default), so it never sees matches beyond that point; it seems to prioritize finding the end of the transaction over scanning the whole event.

Here are my working props.conf and transforms.conf:


transforms.conf

[nomore]
LOOKAHEAD = 100000
REGEX = (?m)(404\sNot\sFound)
DEST_KEY = queue
FORMAT = nullQueue


props.conf

[mod_security]
SHOULD_LINEMERGE = true
MUST_NOT_BREAK_AFTER = (--[a-z0-9]+-A--)
MUST_BREAK_AFTER = (--[a-z0-9]+-Z--)
TRUNCATE = 0
TRANSFORMS-notfounderror = nomore

At least it is working now!
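The LOOKAHEAD behavior described above can be illustrated outside Splunk. This is a hypothetical Python sketch of the idea, not Splunk's actual code: the transform REGEX only sees the first LOOKAHEAD characters of the raw event, so a string past that window never matches at index time.

```python
import re

def transform_matches(raw_event: str, regex: str, lookahead: int = 4096) -> bool:
    """Apply a transform-style regex to only the first `lookahead`
    characters, mimicking Splunk's LOOKAHEAD limit (simplified)."""
    return re.search(regex, raw_event[:lookahead]) is not None

# A large merged event where "404 Not Found" sits past the default 4096 window.
event = "x" * 5000 + "\nHTTP/1.1 404 Not Found\n"

print(transform_matches(event, r"(?m)(404\sNot\sFound)"))          # False
print(transform_matches(event, r"(?m)(404\sNot\sFound)", 100000))  # True
```

Raising LOOKAHEAD to cover the full merged event, as in the [nomore] stanza, is what makes the 404 string visible to the filter.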



gkanapathy
Splunk Employee
Splunk Employee

You probably need:

[source::/nas/log/cache/httpd-www-80/*]

instead of

[source::/nas/log/cache/httpd-www-80/]

neusse
Path Finder

I made the change and it makes no difference. I still get ALL events indexed. This seems to be a widespread problem for folks; it would be nice if Splunk had a better method for filtering, since there is a lot of noise in many logs.

I am at an impasse with this. For the volume of data, we need to filter to make better use of analysis time and system resources.
