fschange with recurse=true - unexpected results from whitelist

jbidinger
Explorer

I'm trying to monitor the xml files that define a Solaris service. These files live under /var/svc/manifest/.../*.xml.

/var/svc/manifest/application/stosreg.xml
/var/svc/manifest/application/management/wbem.xml
/var/svc/manifest/network/rpc/rstat.xml
/var/svc/manifest/network/rpc/bind.xml
/var/svc/manifest/network/rpc/wall.xml
/var/svc/manifest/platform/sun4u/oplhpd.xml
/var/svc/manifest/milestone/multi-user.xml
/var/svc/manifest/system/console-login.xml
/var/svc/manifest/system/mdmonitor.xml

I have the following defined in my inputs.conf:

[filter:whitelist:xml_files]
regex1 = \.xml$

[filter:blacklist:terminal-blacklist]
regex1 = .?

[fschange:/var/svc/manifest]
sourcetype = solaris_etc
index = fileint
filters = xml_files, terminal-blacklist
disabled = false
recurse = true
pollPeriod = 300
fullEvent = true
sendEventMaxSize = -1

I'm using the same whitelist regex for another fschange stanza, and there it does match the xml files. The problem is that with recurse=true it no longer appears to match. I've tried variations such as .*\/.*\.xml, and nothing seems to help.

According to this page in the docs: http://www.splunk.com/base/Documentation/4.1.4/Admin/Monitorchangestoyourfilesystem it should be working.

Any help is greatly appreciated.

  • Jon
1 Solution

gkanapathy
Splunk Employee

The problem is that fschange whitelists and blacklists don't work the way you (or probably anyone else) would want them to work.

If you don't recurse, everything is fine: files in the monitored directory that match the whitelist get indexed, and the others don't.

The problem when you recurse is that directories underneath get the same whitelists and blacklists applied, and any directory that gets blacklisted is skipped, i.e., files within such a directory are all blacklisted.
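To make the recursion problem concrete, here is a rough Python model of it. This is only a sketch, not Splunk's actual implementation: it assumes filters are tried in the listed order and the first matching regex decides, and it uses Python's re module as a stand-in for Splunk's regex engine. The tree and the file top.xml are made up for illustration.

```python
import re

# Filters in the order given by `filters = xml_files, terminal-blacklist`.
FILTERS = [
    ("whitelist", re.compile(r"\.xml$")),
    ("blacklist", re.compile(r".?")),
]

def allowed(path, filters):
    """Assumed semantics: the first filter whose regex matches decides."""
    for kind, regex in filters:
        if regex.search(path):
            return kind == "whitelist"
    return True  # assumption: a path that matches nothing is not excluded

def walk(tree, prefix, filters):
    """Yield surviving file paths; a rejected directory is never entered."""
    for name, child in sorted(tree.items()):
        path = f"{prefix}/{name}"
        if not allowed(path, filters):
            continue  # a blacklisted directory hides everything beneath it
        if isinstance(child, dict):
            yield from walk(child, path, filters)
        else:
            yield path

# Hypothetical layout: dicts are directories, None marks a file.
TREE = {
    "top.xml": None,
    "network": {"rpc": {"bind.xml": None, "rstat.xml": None}},
    "system": {"console-login.xml": None},
}

print(list(walk(TREE, "/var/svc/manifest", FILTERS)))
# prints ['/var/svc/manifest/top.xml']
```

The directories network and system do not end in .xml, so they fall through to the terminal blacklist and are skipped before any file inside them is ever compared against the whitelist.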

I am not sure if this will work, but you can try adding a filter:

[filter:whitelist:directories]
regex1 = \/$

and adding it to your filters list. I have a feeling that it won't work, but if it does, you're okay. If it doesn't, you're out of luck unless you can come up with a regex that distinguishes files from directories (or you have a list of valid subdirectories). For example, if you assume that file names contain a . and directory names don't, maybe:

[filter:whitelist:directories]
regex1 = /[^/\.]+$
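A quick way to sanity-check that second regex is to run it over the kinds of paths involved. The sketch below uses Python's re module, which should behave the same as Splunk's regex engine for a pattern this simple, though that is an assumption. The helper classify and the paths site.d and README are hypothetical, added only to show where the "directories have no dot" guess breaks down.

```python
import re

DIRECTORY_GUESS = re.compile(r"/[^/\.]+$")  # last path component contains no dot
XML_FILE = re.compile(r"\.xml$")

def classify(path):
    """Label a path by which of the two whitelist regexes matches it."""
    if DIRECTORY_GUESS.search(path):
        return "directory"
    if XML_FILE.search(path):
        return "xml"
    return "unmatched"

paths = [
    "/var/svc/manifest/network/rpc",               # treated as a directory
    "/var/svc/manifest/platform/sun4u",            # treated as a directory
    "/var/svc/manifest/network/rpc/bind.xml",      # xml file
    "/var/svc/manifest/milestone/multi-user.xml",  # xml file
    "/var/svc/manifest/site.d",                    # hypothetical dotted directory
    "/var/svc/manifest/README",                    # hypothetical dotless file
]

for path in paths:
    print(f"{classify(path):10} {path}")
```

Every directory in the original listing has a dot-free name, so the guess holds for them. A dotted directory such as site.d would still be skipped, and a dot-free file such as README would be whitelisted along with the directories.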

Note that this problem applies recursively to subdirectories as well. I concede that it is bad.

This behavior is accurately noted and described in the docs: http://www.splunk.com/base/Documentation/latest/Admin/Monitorchangestoyourfilesystem#Configure_the_f...

gkanapathy
Splunk Employee

Given the way the whitelisting works, and since it appears that you're trying to index the full file, there is another way to get the result you want. You can use regular file monitoring (a monitor input) rather than the fschange monitor to get the full file, with some props.conf settings for the source type. In inputs.conf, you would have:

[monitor:///var/svc/manifest/]
whitelist = \.xml$
sourcetype = solaris_etc
index = fileint

in props.conf:

[solaris_etc]
DATETIME_CONFIG = NONE
CHECK_METHOD = entire_md5
TRUNCATE = 0
LINE_BREAKER = (?!)

This should wind up looking the same, with the bonus that you won't have a poll period so the changes should be detected more quickly.
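The LINE_BREAKER value above is an empty negative lookahead, a pattern that can never match, so no event boundary is ever found and each file stays a single event. The Python sketch below illustrates the idea only; it is not how Splunk breaks events internally, split_events is a hypothetical helper, and the manifest text is a made-up sample.

```python
import re

NEVER = re.compile(r"(?!)")      # empty negative lookahead: fails at every position
NEWLINES = re.compile(r"\n+")    # a conventional line-based breaker, for contrast

# Made-up manifest content for illustration.
sample = (
    '<?xml version="1.0"?>\n'
    '<service_bundle type="manifest" name="example">\n'
    '</service_bundle>\n'
)

def split_events(text, breaker):
    """Split text wherever breaker matches; no match leaves one event."""
    return [piece for piece in breaker.split(text) if piece]

print(NEVER.search(sample))                 # prints None
print(len(split_events(sample, NEVER)))     # prints 1
print(len(split_events(sample, NEWLINES)))  # prints 3
```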

gkanapathy
Splunk Employee

I have another answer that I will post for you in a bit that I think will solve your problem.

gkanapathy
Splunk Employee

If you are trying to do full events, I also have another answer for you.

jbidinger
Explorer

In my particular case, it isn't realistic to try to whitelist all of the potential subdirectories since they are not necessarily defined.

Perhaps a potential feature would be the option of a "depth-first" match, where only the files are compared against the regex.

I can see where both types of behaviors would be handy.
