I have a server with over a million files, and roughly 5k new files are generated at a time. Our Splunk installation is unable to keep up with such high volumes.
What would be the ideal approach to onboard logs at this volume from a single server?
Did you see this answer?
https://answers.splunk.com/answers/294295/is-there-a-limit-best-practice-to-how-many-data-in-1.html
In addition to the UF side, you also need to consider the configuration of the Splunk server and the daily log volume.
How about starting with 1 UF and 1 pipeline (the default configuration) and finding out where the bottleneck is?
Just make sure the open-file limit and transfer speed are set to appropriate values first.
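As a starting point, the forwarder's throughput cap lives in limits.conf and the pipeline count in server.conf. A minimal sketch (the values shown are illustrative, not recommendations; tune them to your own testing):

```ini
# limits.conf on the universal forwarder
[thruput]
# Default cap is 256 KBps; raise it (or set 0 for unlimited)
# once you have confirmed where the bottleneck is
maxKBps = 0

# server.conf on the universal forwarder
[general]
# Start with the default single pipeline, then scale up
# only if CPU headroom allows
parallelIngestionPipelines = 2
```

The OS open-file limit (e.g. ulimit -n on Linux) for the splunkd process should also be raised well above the number of files monitored concurrently.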
The ignoreOlderThan setting in inputs.conf can be useful in your case.
Do keep in mind what ignoreOlderThan does: if a file stops being written to for longer than the ignoreOlderThan setting, the forwarder will not attempt to read that file again until its next restart.
For example, you are monitoring /a/webserver.txt
You have an ignoreOlderThan = 2d setting
/a/webserver.txt stops updating for >2 days.
Even if /a/webserver.txt gets updated again, the universal forwarder will never attempt to read it, unless of course you restart the universal forwarder, in which case the 2-day rule applies again.
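For the scenario above, the monitor stanza might look like this (the path is the example path from this thread; the 2d value is illustrative and should match how long your files can realistically go idle):

```ini
# inputs.conf on the universal forwarder
[monitor:///a/webserver.txt]
# Skip files whose modification time is older than 2 days;
# beware the restart caveat described above
ignoreOlderThan = 2d
```

With a million-plus files on disk, a setting like this keeps the forwarder's file tracker focused on recently active files instead of scanning everything.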
Thanks! Above post answers most of my queries 🙂