Splunk Search

How to filter out events in Splunk 6.3.1 at index-time, except files containing the string "#!" in the first 5 characters of the file?

stanvv
New Member

Hi,

I only want to index files containing the string #! in the first 5 characters of the file.
Therefore, I created the following inputs.conf:

[monitor:pathname] 
blacklist = (?i:archive|develop|data|backup|\.txt$|\.gz$|\.tar$|\.csv$|\.bck$|\.log$|\.old$|\d{6,})
disabled = false 
host = script 
index = abcindex 
sourcetype = abcscript

Props.conf:

[abcscript] 
TRANSFORMS-set= setnull,setparsing

Transforms.conf:

[setnull] 
REGEX = . 
DEST_KEY = queue
FORMAT = nullQueue

[setparsing] 
REGEX = (.{0,5}(#!))
DEST_KEY = queue
FORMAT = indexQueue

Based on http://docs.splunk.com/Documentation/Splunk/6.3.1/Forwarding/Routeandfilterdatad
Unfortunately, everything is currently indexed in "abcindex", not only the files starting with #!
I also tried it with a dummy string in a dummy file, but again, everything was indexed.
I restarted Splunk after changing the config files.

Any idea what goes wrong here?
Using Splunk 6.3.1 at the moment.

Thanks

0 Karma

tmarlette
Motivator

Out of curiosity, are you trying to do all of this on a universal forwarder?

If you are, these props/transforms settings won't work on a UF; you have to add them to your indexing tier.

0 Karma

stanvv
New Member

I'm testing it on a local Splunk enterprise at the moment.

0 Karma

woodcock
Esteemed Legend

Your RegEx is wrong; try this:

REGEX = ^(.{0,3}(#!))

This needs to be deployed to all your indexers, and the Splunk instances running there need to be restarted. After this is done, incoming events will be properly filtered, but events indexed before the restart will not be affected.
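
To illustrate what the anchor changes, here is a minimal sketch that compares the two patterns on a few made-up lines. It uses Python's `re` module as a stand-in for Splunk's regex engine, which is an assumption (the two should agree on patterns this simple), and the sample strings are hypothetical.

```python
import re

# Python's re is only a stand-in for Splunk's regex engine (assumption).
UNANCHORED = re.compile(r"(.{0,5}(#!))")  # pattern from the original transforms.conf
ANCHORED = re.compile(r"^(.{0,3}(#!))")   # pattern suggested above

# Hypothetical sample lines, not taken from the monitored files.
samples = [
    "#!/bin/sh",                     # "#!" at the very start
    "   #!/bin/sh",                  # "#!" after three characters
    "echo start  #! not a shebang",  # "#!" deep inside the line
]

# search() scans the whole string, so only the ^ anchor ties the match to the start.
results = [(bool(UNANCHORED.search(s)), bool(ANCHORED.search(s))) for s in samples]

for s, (unanchored, anchored) in zip(samples, results):
    print(f"{s!r}: unanchored={unanchored} anchored={anchored}")
```

The unanchored pattern matches all three samples because `.{0,5}` may match zero characters anywhere in the string; the anchored one rejects the last sample, where "#!" sits well past the first five characters.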

stanvv
New Member

Thanks for your answer. I tried the above (changed the regex, restarted, and tested with dummy files: one starting with #! and one without), but files not starting with #! were still indexed.
I'm testing it on a local Splunk enterprise at the moment.

0 Karma

woodcock
Esteemed Legend

The RegEx applies to each event, not to the entire file.
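
As a rough sketch of what per-event evaluation means: assuming the file is broken into one event per line, the transforms are tested against each line separately. The `route` function and the sample script below are hypothetical, and this only mimics the intended `setnull,setparsing` ordering; it is not a reproduction of Splunk's pipeline.

```python
import re

PATTERN = re.compile(r"^(.{0,3}(#!))")  # regex suggested earlier in the thread

# Hypothetical script, assumed to be split into one event per line.
script = "#!/bin/sh\n# Intro 123\nABC = 123\n"


def route(event):
    """Mimic TRANSFORMS-set = setnull,setparsing for one event (illustrative only)."""
    queue = "nullQueue"        # setnull: REGEX = . matches every non-empty event
    if PATTERN.search(event):
        queue = "indexQueue"   # setparsing runs second and overrides the queue on a match
    return queue


routing = [(line, route(line)) for line in script.splitlines()]

for line, queue in routing:
    print(f"{queue:10} {line}")
```

Under that assumption, only the line that itself starts with "#!" would be kept; the remaining lines of the same file would be judged on their own and dropped.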

0 Karma

stanvv
New Member

The files I'm monitoring are scripts (sometimes with an undefined filetype). So if the file content itself starts with #! I want it to be indexed. If it doesn't, it should go to the nullQueue.

Example
File 1 (needs to be indexed)

#!
########
# Intro 123
########
#Scriptinfo
ABC = 123

File 2 (send to nullQueue)

########
# Intro 234
########
#Scriptinfo
DEF = 567

Do props.conf and transforms.conf also work for files that aren't log or text files? Any ideas on the best solution?

0 Karma

woodcock
Esteemed Legend

Make sure you set this for your sourcetype in props.conf:

[YourSourcetypeHere]
LINE_BREAKER=(\Z)
TRUNCATE=500000
SHOULD_LINEMERGE = 1

This will treat the entire file as a single event, and then it should work as you expect. Deploy this to the Indexers (or Heavy Forwarders) and restart all Splunk instances there. This will apply ONLY TO FUTURE EVENTS (the scripts that are already there have already been processed), so you will have to create new files in order to test this.
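
A minimal sketch of the intended effect, using the two example files from earlier in the thread as whole-file events. Python's `re` is a stand-in for Splunk's regex engine here, and whether Splunk anchors `^` the same way on a multi-line event is an assumption; the `keep` helper is hypothetical.

```python
import re

# Without re.MULTILINE, ^ matches only at the very start of the text,
# and . does not match a newline, so the match cannot spill onto later lines.
PATTERN = re.compile(r"^(.{0,3}(#!))")

# The two example files from the thread, each treated as one event.
file_1 = "#!\n########\n# Intro 123\n########\n#Scriptinfo\nABC = 123\n"
file_2 = "########\n# Intro 234\n########\n#Scriptinfo\nDEF = 567\n"


def keep(event_text):
    """Return True if the whole-file event would be routed to indexQueue."""
    return PATTERN.search(event_text) is not None


decisions = {"file_1": keep(file_1), "file_2": keep(file_2)}
print(decisions)
```

With the whole file as one event, the decision is made once per file, which is the behaviour asked for in the original question.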

0 Karma