Getting Data In

Yet another filtering problem. Many transforms in transforms.conf not filtering

neusse
Path Finder

I am trying to filter with many transform statements. I believe everything is configured correctly. But I get ALL events indexed. None are going to the nullQueue. Please help! It seems easy enough but I am not getting this.

As I understand it:

In my props.conf I have two stanzas. One is [mod_security]; it does a few things to parse the event and extracts some fields.

Also in my props.conf I have a [source::/some/path/to/directory/] stanza. This has all the transform statements that check whether an event should be sent to the nullQueue.

What happens is that all events are indexed and none are filtered. I have also tried adding all the transform statements to the [mod_security] stanza, but the result is the same: everything is indexed.

Lastly, I want to check that a couple of strings exist in the event after filtering. If they do, index the event; if not, send it to the nullQueue.

Here are my props.conf and transforms.conf. Any help would be appreciated.


props.conf

[mod_security]
TRUNCATE = 0 
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = (--[a-z0-9]+-A--)
REPORT-get = get
REPORT-post = post
REPORT-severity = severity


[source::/nas/log/cache/httpd-www-80/]
TRANSFORMS-f1  = f1
TRANSFORMS-f2  = f2
TRANSFORMS-f3  = f3
TRANSFORMS-f4  = f4
TRANSFORMS-f5  = f5
TRANSFORMS-f6  = f6
TRANSFORMS-ok = null,ok

=========================================
transforms.conf


[get]
REGEX = (GET.+?)$
FORMAT = AA_get::$1

[post]
REGEX = (POST.+?)$
FORMAT = AA_post::$1

[severity]
REGEX = (severity.+?)\]
FORMAT = AA_severity::$1

[f1]
REGEX = (99\.99\.99\.38)
DEST_KEY = queue
FORMAT = nullQueue

[f2]
REGEX = .*404\sNot\sFound.*
DEST_KEY = queue
FORMAT = nullQueue

[f3]
REGEX = .*401\sUnauthorized.*
DEST_KEY = queue
FORMAT = nullQueue

[f4]
REGEX = .*500\sInternal\sServer\sError.*
DEST_KEY = queue
FORMAT = nullQueue

[f5]
REGEX = .*403\sForbidden.*
DEST_KEY = queue
FORMAT = nullQueue

[f6]
REGEX = FILTERED\sTO\s
DEST_KEY = queue
FORMAT = nullQueue

[null]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[ok]
REGEX = (Pattern\smatch)|(Matched\ssignature)
DEST_KEY = queue
FORMAT = indexQueue
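As I understand the docs, the transforms listed in a single TRANSFORMS- setting run in order, and a later transform that matches overwrites the queue assignment from an earlier one; that is what TRANSFORMS-ok = null,ok relies on: [null] sends everything to the nullQueue, then [ok] rescues matching events back to the indexQueue. A minimal Python sketch of that routing logic (my approximation, not Splunk's actual code):

```python
import re

# Simulated transform list for "TRANSFORMS-ok = null,ok":
# each (regex, destination) pair mirrors a transforms.conf stanza.
TRANSFORMS = [
    (re.compile(r"."), "nullQueue"),                                       # [null]
    (re.compile(r"(Pattern\smatch)|(Matched\ssignature)"), "indexQueue"),  # [ok]
]

def route(event: str) -> str:
    queue = "indexQueue"  # default destination for any event
    for regex, dest in TRANSFORMS:
        if regex.search(event):
            queue = dest  # a later matching transform overwrites the earlier one
    return queue

print(route("ModSecurity: Pattern match \"foo\""))  # indexQueue
print(route("some uninteresting noise"))            # nullQueue
```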
 

neusse
Path Finder

Yes, there is a solution. I found nothing describing this anywhere on splunk.com.

My issue of not being able to search deep enough into the event at index time was solved by setting LOOKAHEAD in transforms.conf. It turns out that when indexing, Splunk only applies a transform's REGEX to the first LOOKAHEAD characters of each event (4096 by default), so it never sees matches deeper in a long merged event; it seems to prioritize finding the end of the transaction.

Here are my working props.conf and transforms.conf:


transforms.conf

[nomore]
LOOKAHEAD = 100000
REGEX=(?m)(404\sNot\sFound)
DEST_KEY=queue
FORMAT=nullQueue


props.conf

[mod_security]
SHOULD_LINEMERGE = true
MUST_NOT_BREAK_AFTER = (--[a-z0-9]+-A--)
MUST_BREAK_AFTER = (--[a-z0-9]+-Z--)
TRUNCATE = 0
TRANSFORMS-notfounderror = nomore

At least it is working now!
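For anyone else hitting this: the key point is that the transform REGEX is only tested against the first LOOKAHEAD characters of each event (4096 by default), so in a long merged mod_security event the status line can sit past that window. A rough Python illustration of the effect (the event content here is made up):

```python
import re

DEFAULT_LOOKAHEAD = 4096   # Splunk's default LOOKAHEAD in transforms.conf
RAISED_LOOKAHEAD = 100000  # the value used in the [nomore] stanza above

pattern = re.compile(r"(?m)404\sNot\sFound")

# A long merged event whose interesting line sits past 4096 characters.
event = ("-" * 5000) + "\nHTTP/1.1 404 Not Found\n"

def matches(event: str, lookahead: int) -> bool:
    # Only the first `lookahead` characters are visible to the regex.
    return pattern.search(event[:lookahead]) is not None

print(matches(event, DEFAULT_LOOKAHEAD))  # False: match lies beyond the window
print(matches(event, RAISED_LOOKAHEAD))   # True: the raised window reaches it
```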



gkanapathy
Splunk Employee

You probably need:

[source::/nas/log/cache/httpd-www-80/*]

instead of

[source::/nas/log/cache/httpd-www-80/]

A source:: stanza without a wildcard only matches that exact source value; the files inside that directory have longer source paths, so the stanza (and its transforms) never applies to them.

neusse
Path Finder

I made the change and it makes no difference. I still get ALL events indexed. It seems this is a widespread problem for folks; it would be nice if Splunk had a better method for filtering, since there is a lot of noise in many logs.

I am at an impasse with this. For the volume of data, we need to filter to make better use of analysis time and system resources.
