Whitelist/blacklist multiple files in same directory

attgjh1
Communicator

I've been reading the documentation and am stumped by this part:

If you create a blacklist line for each file you want to ignore, Splunk activates only the last filter.

There are 7 different kinds of logs. 4 of them each have their own sourcetype, while the other 3 are to be blacklisted as they aren't required. All of them are logged to the same directory (something which I can't change). Based on the above, I'm confused as to whether I can blacklist those 3 types of logs. Alternatively, am I allowed to configure 4 "indexers" to identify and filter the individual logs for indexing?

Some examples of log file names:

  1. pulsar_handler_.2011-12-21-00
  2. admin_report.2011-12-21-00
  3. inbound_handler_bt091.2011-12-22-00

That is, the extensions are actually the dates of the logs, with the last digits representing the hour. My regex will be limited to the first few words that identify the log type.

I have tried:
[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\cei_inbound_handler*]
disabled = false
followTail = 0
sourcetype = Whitelist_inbound

[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\cei_pulsar_handler*]
disabled = false
followTail = 0
sourcetype = Whitelist_pulsar

It doesn't seem to be indexing my logs now.

Thanks a lot!

1 Solution

attgjh1
Communicator

Thanks lguinn, I got an error when I tried to create another whitelist for the same directory (something about an identical directory having been used before).

I tried your suggestion in my inputs.conf:

[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\cei_inbound_handler*]
disabled = false
followTail = 0
sourcetype = Whitelist_inbound

[monitor://C:\Documents and Settings\attgjh1\Desktop\Whitelist test\cei_pulsar_handler*]
disabled = false
followTail = 0
sourcetype = Whitelist_pulsar

EDIT: extracted from the long list of comments, as this is the correct answer.

lguinn2
Legend

In older Splunk versions, things worked differently. I think the documentation text you quoted may be a holdover from those days.

But if you are talking about blacklisting inputs, in inputs.conf you can specify

[monitor:///pathtologdir]
blacklist=pulsar_handler|admin_report

With that, Splunk will not index files whose path matches either pulsar_handler or admin_report.

The vertical bar | means "or" in regular expressions. You can add more to the list, of course.

You can also specify the blacklist in the GUI, if you click on More Options.

RE: "multiple indexers", I am not sure what you mean. If you simply blacklist the files that you don't want, all the other files in the directory will be indexed. If you need to specify the sourcetypes for these files, you can do it in props.conf.
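As a rough sketch of that last point, a props.conf fragment along these lines could assign sourcetypes by file name. The stanza patterns and sourcetype names are assumptions borrowed from names used elsewhere in this thread, not tested settings, so adjust them to the real file names.

```ini
# props.conf (sketch; patterns and sourcetype names are assumptions)
# In a source:: stanza, "..." matches any number of path segments
# and "*" matches anything within a single path segment.
[source::...inbound_handler*]
sourcetype = Whitelist_inbound

[source::...pulsar_handler*]
sourcetype = Whitelist_pulsar
```

With something like this in place, a single monitor stanza with one blacklist can cover the whole directory, and the sourcetype is decided per file rather than per input.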

attgjh1
Communicator

OK. I recreated fake test logs with simple data and it worked based on your solution. I guess there is some error on my part with my actual logs.

Thanks a lot for your help!

lguinn2
Legend

| extract reload=t only re-applies the settings from props.conf and transforms.conf.

If you are manually editing inputs.conf (which you are), you must restart Splunk for the changes to take effect.

attgjh1
Communicator

3.
I posted my question on Splunk Answers, then went ahead and made the changes in inputs.conf and ran
| extract reload=t
to get the settings reloaded. Splunk stopped indexing either kind of log.

That's the part that stumped me. The changes to inputs.conf can be found above in my question.
I've only just started learning and using Splunk Web, so directly editing inputs.conf has been a confusing trial-and-error process on my part 😞

Once again, thanks a lot so far.

attgjh1
Communicator

Hmm. Strangely enough, they are disabled already.

I'll list the steps I took since I posted this question.

1.
I used the default whitelist setting in Splunk Web and entered:
"cei_inbound_handler"
I fed in those logs, backdated 2 years. 6 events were recorded. It ignored the other files because of the whitelist (working as intended).

2.
I attempted to create a 2nd monitor on the same directory for "cei_pulsar_handler", but was unable to do so, as Splunk does not allow multiple monitors on the same directory.

(continued in next comment)

lguinn2
Legend

Aha! Check out these settings in props.conf: MAX_DAYS_AGO, MAX_DIFF_SECS_AGO, MAX_DIFF_SECS_HENCE

http://docs.splunk.com/Documentation/Splunk/latest/admin/propsconf

You might also want to check out ignoreOlderThan in inputs.conf -- though this is disabled by default.

I am wondering if Splunk is not indexing this data because it is "too old"
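A minimal sketch of where those settings live, assuming the sourcetype name used earlier in this thread and placeholder values. MAX_DAYS_AGO applies to the timestamps extracted from the events, while ignoreOlderThan looks at the modification time of the file itself, so they catch different kinds of "old" data.

```ini
# props.conf (sketch; the sourcetype name and value are assumptions)
[Whitelist_inbound]
# Accept event timestamps up to roughly three years in the past.
MAX_DAYS_AGO = 1095

# inputs.conf (sketch; shown only to illustrate the setting)
[monitor:///pathtologdir]
# Skips files whose modification time is older than 7 days.
# Leave this unset if old test files should still be picked up.
ignoreOlderThan = 7d
```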

attgjh1
Communicator

I changed all the timestamps associated with each event, setting them 2 years back. The data itself is often repeated with varying timestamps. Splunk never had a problem reading the events separately as long as there was a time difference between them.

It worked the first time, when I whitelisted only one log file (so I think I'm feeding in new test data correctly), but the problem popped up when I tried to monitor the same directory with a different timestamp, which is an error in itself. So now I'm trying to figure out the solution you posted, but it doesn't seem to be working now =/

lguinn2
Legend

By "modify the time stamps", do you mean that you changed the data in the file, or just changed the timestamp of the file?

attgjh1
Communicator

I'm using a new test data log, with all the timestamps modified.
But even with spaces in my directory name, I had no problem monitoring the files initially.

As brought up earlier, each monitor must be unique, but my logs are all in the same directory (no subfolders), so I can't simply add more monitors, each with its own whitelist. It was working fine when I only wanted to monitor and filter one of the logs.

lguinn2
Legend

I notice that you are using a directory name with spaces in it - can you avoid that or escape it?

Also, are you reusing test files for this? Splunk will not index a file that it has already indexed, even if you move it to a different directory and rename it. You can work around this with the crcSalt setting in inputs.conf, although usually you don't want Splunk to be indexing data again! If this is really just test data, you can also delete the first few lines of each file, which will make Splunk see it as a different file.
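The CRC salt approach mentioned above could look roughly like this in inputs.conf. The monitor path is a placeholder, and this is a sketch intended for test data only, since it can cause renamed or rotated files to be indexed a second time.

```ini
# inputs.conf (sketch for test data only; the path is a placeholder)
[monitor:///pathtologdir]
# <SOURCE> is a literal keyword, not something to fill in: Splunk mixes
# the full source path into the file checksum, so a renamed or moved
# copy of an already-indexed file is treated as a new file.
crcSalt = <SOURCE>
```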

http://docs.splunk.com/Documentation/Splunk/latest/Admin/Inputsconf

lguinn2
Legend

The easiest way to do what you listed in your comment:

[monitor:///pathtoyourfile/pulsar_handler*]
sourcetype=pulsar

[monitor:///pathtoyourfile/admin_report*]
sourcetype=admin

It is more efficient to put the file name pattern in the monitor stanza itself; then there is no need for the whitelist. This also prevents possible problems, because you cannot have identical monitor stanzas.

attgjh1
Communicator

Just to clarify:

I'm using Splunk Web. Since the indexer is already monitoring a directory, in order for it not to monitor files other than the ones specified, my whitelist just has to include this regex, right?

That is, in Manager » Data inputs » Files & directories, for each "path to data" setting, I only need to modify the whitelist field:

whitelist = "pulsar_handler" for sourcetype: pulsar
whitelist = "admin_report" for sourcetype: admin

Thanks for your help!
