I'm trying to index all the files marked with a [Y] in the directory structure below.
[Y] - /tmp/test.log
[Y] - /tmp/logs/test.log
[Y] - /tmp/logs/test.log.20160218
[N] - /tmp/logs/test.log.20160218.gz
[N] - /tmp/logs/test.log.20160218.out
[N] - /tmp/logs/test.log20160218
[N] - /tmp/logs/test.log20160218.gz
My monitor stanza in inputs.conf is as follows:
[monitor:///tmp/*/test.(log|log\.[0-9]+)]
index = splunkprod
sourcetype = testdata
ignoreOlderThan = 5d
However, it does not pick up anything. Does anybody know why? Thanks
The stanza name ("monitor:...") cannot contain regular expressions, only wildcards.
Also, the path specified in inputs.conf doesn't match your example files: the examples don't start with 'xfer'.
The /xfer in the monitor path was a typo. Updated the question.
Per http://docs.splunk.com/Documentation/Splunk/latest/Data/Specifyinputpathswithwildcards under Wildcards and regular expression metacharacters, it states that:
[monitor://var/.../log[A-Z0-9].log]
Splunk Enterprise treats [A-Z0-9] as a regex because of the wildcard '...' in the previous stanza segment.
I would expect it to consider it as a regex as the "..." wildcard is present. Is the documentation not accurate?
The very first line of that document says "Input path specifications in inputs.conf do not use regular expressions (regexes) but rather Splunk-defined wildcards." and the second section specifies that it supports "Wildcards and regular expression metacharacters". In the section you're referring to, Splunk treats [A-Z0-9] as regular expression metacharacters if a wildcard (asterisk *) is used in the monitoring path.
Ah, thanks. I missed that.
Try it like this:
[monitor:///tmp]
recursive = true
index = splunkprod
sourcetype = testdata
ignoreOlderThan = 5d
whitelist = (test\.log$|test\.log\.\d+$)
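To sanity-check the whitelist before deploying it, the regex can be run against the file list from the question. This is only a rough sketch: it uses Python's re module as a stand-in for Splunk's regex engine and assumes the whitelist is matched, unanchored, against the full file path. The names WHITELIST, EXPECTED and is_whitelisted are made up for the illustration.

```python
import re

# Whitelist regex copied from the stanza above. Python's re is only a
# stand-in here for Splunk's regex engine (an assumption of this sketch).
WHITELIST = re.compile(r"(test\.log$|test\.log\.\d+$)")

# Paths from the question, mapped to the wanted outcome ([Y] = True, [N] = False).
EXPECTED = {
    "/tmp/test.log": True,
    "/tmp/logs/test.log": True,
    "/tmp/logs/test.log.20160218": True,
    "/tmp/logs/test.log.20160218.gz": False,
    "/tmp/logs/test.log.20160218.out": False,
    "/tmp/logs/test.log20160218": False,
    "/tmp/logs/test.log20160218.gz": False,
}


def is_whitelisted(path):
    """Return True if the whitelist regex matches anywhere in the path."""
    return WHITELIST.search(path) is not None


if __name__ == "__main__":
    for path, wanted in EXPECTED.items():
        got = is_whitelisted(path)
        status = "ok" if got == wanted else "MISMATCH"
        print(f"{status:8} wanted={wanted!s:5} got={got!s:5} {path}")
```

Because the pattern is not anchored on the left, a file such as /tmp/logs/mytest.log would also match. If that matters, a tighter pattern like /test\.log(\.\d+)?$ requires the file name to start right after a path separator.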
If the directory /tmp has a lot of files and subdirectories, are there any performance implications to monitoring /tmp with a whitelist? Does it initially generate a list of all files and folders in that directory and then prune it using the whitelist?
Yes, the more wildcards there are, the more files Splunk has to keep track of. The whitelist/blacklist helps a little. If there are too many files and folders under /tmp, I would suggest splitting the monitoring into two parts: one for the specific file(s) under /tmp and another for all files under /tmp/logs.
If I were to split the monitoring up, can I still redirect them to the same sourcetype?
Yes, you can. As long as the monitoring stanza names ([monitor://.....]) differ, you can create multiple stanzas with the same index, sourcetype, whitelist, etc.