Deployment Architecture

Does Splunk need to go through all the files after a restart before indexing new events?

quahfamili
Path Finder

Hi,

I would like to ask if anyone has experienced this or has a solution.

I have a forwarder that is reading many small log files (on the order of millions) in a folder. Every time I make a configuration change for a new index or .conf file, I need to restart. Splunk then takes forever to go through all these files before it starts indexing my new files.

Is there a way in the configuration to overcome this?

Thanks in advance.
Alan


FrankVl
Ultra Champion

I have similar experiences. Especially when the log files are on network shares rather than locally, that can take quite some time (even with much smaller numbers).

In your case: are all those log files still active, or does the folder also contain old rotated files that are inactive? If there are a lot of old inactive files, there are a few options you can look at:
- Put a cleanup script in place to get rid of those old files after some time.
- If the old files can be recognized from their name (e.g. they get a suffix when rotated), write your input stanza such that they are ignored.
- Use the ignoreOlderThan setting in inputs.conf to ignore old files.
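
To illustrate the last two options, here is a minimal inputs.conf sketch. The monitored path, the rotated-file suffix pattern, and the 7-day threshold are assumptions for illustration, not taken from your setup:

```ini
# inputs.conf on the forwarder (path, pattern and threshold are hypothetical)
[monitor:///var/log/myapp]
# Skip files whose modification time is more than 7 days old
ignoreOlderThan = 7d
# Skip rotated files, assuming they get a numeric or .gz suffix
blacklist = \.(\d+|gz)$
```

Note that ignoreOlderThan looks at the file modification time, so pick a threshold comfortably longer than the gap between writes to any file that is still active.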

If all those files are actively being written to, you could perhaps look at enabling multiple pipelines on your forwarder (if the hardware specs allow that), to enable Splunk to process multiple files in parallel.
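
As a rough sketch, multiple pipelines are enabled in server.conf on the forwarder. The value of 2 is only an assumed starting point; each extra pipeline set uses additional CPU and memory:

```ini
# server.conf on the forwarder
[general]
# Assumed value; raise it only if the hardware has spare capacity
parallelIngestionPipelines = 2
```

As far as I know, the forwarder has to be restarted for this setting to take effect.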
