Tailing processor and rsync (dot files) - blacklist?

howyagoin
Contributor

Easy one, this, but I can't seem to get it right.

I'm monitoring a series of directories which are rsync'd from other servers. Splunk, being ever so efficient, is managing to index the . files that rsync creates, as well as the files after they arrive. This has resulted in rather a lot of unnecessary data.

The answer, to me, should be either whitelists or blacklists.

For one of the directories, I can whitelist, as the files are all "blah.log" and thus "blah.log$" should work fine.

However, in other directories the files are named all sorts of things, and there's no easy regex to whitelist. So a blacklist should do the trick. But I can't seem to get a regex working for "any file starting with a ."

Hints?


howyagoin
Contributor

There was something definitely amiss with the ability to parse recursive directories and use whitelist/blacklists, so I've gone ahead and created a monitor stanza in my inputs.conf for each of the 8 files. That was the only thing that got Splunk to actually show the content of those files in a query.
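For anyone taking the same route, the per-file stanzas can be generated rather than typed by hand. This is only a rough sketch: it uses the three log directories quoted elsewhere in this thread (the others aren't listed), reuses the settings from the directory-level stanza, and assumes the file in each directory is called submarine.out, which is a guess based on the whitelist.

```python
# Log directories quoted in this thread; the remaining ones are not listed.
LOG_DIRS = [
    "/Volumes/A/b/c/cluster3/data/instance/box-4/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-3/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs",
]

# Assumption: the file name is inferred from the whitelist, not stated outright.
FILE_NAME = "submarine.out"

# Settings copied from the directory-level stanza (minus the whitelist).
SETTINGS = {
    "crcSalt": "<SOURCE>",
    "disabled": "false",
    "followTail": "0",
    "host": "strawberry",
    "index": "submarine",
    "sourcetype": "log4j",
}

def monitor_stanza(path):
    """Build one inputs.conf monitor stanza for a single file path."""
    lines = [f"[monitor://{path}]"]
    lines += [f"{key} = {value}" for key, value in SETTINGS.items()]
    return "\n".join(lines)

stanzas = [monitor_stanza(f"{d}/{FILE_NAME}") for d in LOG_DIRS]
print("\n\n".join(stanzas))
```

The output can be pasted into inputs.conf after checking each path by hand.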


gkanapathy
Splunk Employee
blacklist = /\.[^/]+$

should do it
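A quick way to sanity-check this pattern outside Splunk is to run it against a few sample paths. The sketch below uses Python's re module rather than Splunk's own regex engine (the two should agree on a pattern this simple), assumes the regex is tested against the full file path, and uses made-up file names.

```python
import re

# The blacklist pattern from the post above: a "/" followed by a literal dot,
# then one or more non-slash characters up to the end of the path.
BLACKLIST = re.compile(r"/\.[^/]+$")

def is_blacklisted(path):
    """Return True if the final path component starts with a dot."""
    return BLACKLIST.search(path) is not None

# Hypothetical paths; the first name imitates an rsync temporary file.
samples = [
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs/.submarine.out.Xy12Zq",
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs/submarine.out",
    "/Volumes/A/b/c/.hidden/submarine.out",
]
for p in samples:
    print(is_blacklisted(p), p)
```

Only the first path is flagged. The third shows that the pattern looks at the file name only, so an ordinary file inside a dot directory is not excluded.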

southeringtonp
Motivator

What does your current regex look like? Make sure you're not forgetting to put a backslash in front of the dot, or the regex will treat it as a wildcard that matches any character.

Have you tried just:

blacklist=^\.

(For older versions of Splunk, use _blacklist instead of blacklist)
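The difference the backslash makes is easy to see with a couple of test strings. This is a sketch in Python's re module, not Splunk itself, and the paths are invented. It also assumes the pattern is applied to the full path, in which case a pattern anchored with ^ would need to account for the leading directories.

```python
import re

# Hypothetical full paths, as a monitor input would see them.
dot_file = "/Volumes/A/b/c/.blah.log.a1B2c3"
normal_file = "/Volumes/A/b/c/blah.log"

escaped = re.compile(r"/\.[^/]+$")   # "\." is a literal dot
unescaped = re.compile(r"/.[^/]+$")  # "." matches any single character

print(bool(escaped.search(dot_file)))       # True
print(bool(escaped.search(normal_file)))    # False
print(bool(unescaped.search(normal_file)))  # True: every file would match

# "^\." only matches when the tested string itself begins with a dot,
# so it cannot match a full path that starts with "/".
anchored = re.compile(r"^\.")
print(bool(anchored.search(dot_file)))            # False
print(bool(anchored.search(".blah.log.a1B2c3")))  # True
```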


howyagoin
Contributor

I've put in gkanapathy's regex for now, but I think something is wrong with my whitelist. Is there any potential interaction between whitelists and monitoring directories that have sub-directories (it's in the sub-directories that my files actually live)?

I now have:


[monitor:///Volumes/A/b/c]
crcSalt = <SOURCE>
disabled = false
followTail = 0
host = strawberry
index = submarine
whitelist = submarine\.out$
sourcetype = log4j

However, my files are actually located in:

/Volumes/A/b/c/cluster3/data/instance/box-4/logs
/Volumes/A/b/c/cluster2/data/instance/box-3/logs
/Volumes/A/b/c/cluster2/data/instance/box-1/logs

And so on. A list of about 8 or so locations, but, since they're all under "c" I just pointed Splunk at that.

According to the inputstatus Tailing Processor URL, it has found "c" and some files in "c" which did not match the whitelist, but there's no indication that it has picked up anything further down the path, and that data is definitely not in the index (yesterday's data is there, from before I made this whitelist change).
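One thing that can be ruled out separately is the whitelist regex itself. The sketch below checks it against the nested paths using Python's re module; it assumes the pattern is tested against the full path and that the files are named submarine.out (a guess based on the whitelist). It does not reproduce how Splunk walks sub-directories, so it says nothing about the traversal.

```python
import re

# The whitelist pattern from the stanza above.
WHITELIST = re.compile(r"submarine\.out$")

# Log directories from the post; the file name is an assumption.
LOG_DIRS = [
    "/Volumes/A/b/c/cluster3/data/instance/box-4/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-3/logs",
    "/Volumes/A/b/c/cluster2/data/instance/box-1/logs",
]
FILE_NAME = "submarine.out"

def is_whitelisted(path):
    """Return True if the path ends with submarine.out."""
    return WHITELIST.search(path) is not None

for d in LOG_DIRS:
    path = f"{d}/{FILE_NAME}"
    print(is_whitelisted(path), path)

# A hypothetical rotated file no longer ends with the pattern, so it is excluded.
print(is_whitelisted(f"{LOG_DIRS[0]}/submarine.out.1"))
```

If every nested path passes here, the pattern is probably not what is keeping the files out, and the directory recursion is the next thing to look at.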
