I'm planning a Splunk deployment that will involve 2 indexers, 1 search head and 4 forwarders spread across various networks in various geographic locations. We are using a forwarder (UF or HWF, we're not settled on which one yet) to collect syslog traffic on private networks and forward this traffic over a VPN to a remote site where the indexers and search head will be located.
In our labs we've spent time testing the forwarding speed we can expect to achieve, and found that compression severely limits the maximum EPS we can reach; however, Splunk shows very low CPU utilisation when compression is enabled.
Servers:
Testing scenario:
The most representative example I can give of the problem is this:
However:
The concerning factor for us is the relatively light CPU usage. We will be deploying forwarders on dedicated hardware and we want Splunk to utilise as much of their power as possible.
Are there any settings in Splunk that we can tune or tweak to make the Splunk UF/HWF more "greedy" and help increase our EPS rate with compression? We've tweaked all the options we can find in Splunk's inputs.conf and outputs.conf, and at the kernel level for UDP buffers and queues. However, this only delays the onset of the problem: we remain limited to ~14k EPS when compression is enabled, while resource utilisation stays relatively low.
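For reference, the kernel-level tuning we applied for the UDP buffers and queues is along these lines (a minimal sketch; the exact values are illustrative, not a recommendation):

    # /etc/sysctl.conf -- enlarge UDP receive buffers so bursts of syslog
    # traffic aren't dropped before Splunk can read them (values illustrative)
    net.core.rmem_max = 16777216         # largest receive buffer an app may request
    net.core.rmem_default = 8388608      # default socket receive buffer
    net.core.netdev_max_backlog = 10000  # per-NIC queue before the kernel drops packets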
Thanks!
Suggestions to improve thruput: raise the maxKBps setting in the [thruput] stanza of limits.conf. The Light and Universal Forwarders cap at 256 kilobytes/sec by default unless otherwise overridden, and I believe that cap is applied pre-compression.
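For example, raising (or removing) the cap on the forwarder would look something like this (a minimal sketch; maxKBps = 0 means uncapped):

    # $SPLUNK_HOME/etc/system/local/limits.conf on the forwarder
    [thruput]
    # default is 256 KB/s on the Light/Universal Forwarder; 0 = no limit
    maxKBps = 0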
Thank you! SSL forwarding did the trick; event rates were back up toward 100k EPS while using slightly less bandwidth than having normal compression on.
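For anyone who finds this later, the forwarder-side outputs.conf we ended up with looks roughly like this (the output group name and certificate paths are placeholders for our own; verify the attribute names against your Splunk version):

    # $SPLUNK_HOME/etc/system/local/outputs.conf on the forwarder
    [tcpout:primary_indexers]
    server = indexer1.example.com:9997, indexer2.example.com:9997
    sslCertPath = $SPLUNK_HOME/etc/auth/server.pem    # placeholder path
    sslRootCAPath = $SPLUNK_HOME/etc/auth/cacert.pem  # placeholder path
    sslPassword = password
    useClientSSLCompression = true  # compress inside the SSL layer
    compressed = false              # Splunk's own TCP compression stays off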
Based on your recommendation I've also put rsyslog in place to capture the events. Thanks for the tip; it should save us from losing events should Splunk crash.
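The rsyslog side is just a UDP listener spooling to flat files that Splunk then monitors; a minimal sketch using legacy rsyslog syntax (the paths and monitor stanza are our own choices, adjust to taste):

    # /etc/rsyslog.conf -- receive UDP syslog and write it to disk
    $ModLoad imudp
    $UDPServerRun 514

    # one directory per sending host so Splunk can derive host from the path
    $template RemoteFile,"/var/log/remote/%HOSTNAME%/syslog.log"
    *.* ?RemoteFile

    # inputs.conf on the forwarder, picking the files back up
    [monitor:///var/log/remote]
    sourcetype = syslog
    host_segment = 4  # /var/log/remote/<host>/... -> 4th segment is the host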
Thank you very much for the detailed reply, I'll be sure to try each of these points when I'm back in the office on Monday. We have tried rsyslog to capture UDP syslog, which dropped our top EPS to ~85k, but we still couldn't top 15k EPS with compression enabled.
It sounds like SSL could be a good option for us. I'm aware that UDP syslog -> TCP forwarding carries a big overhead with the 64k blocks, and SSL compression could help a lot there.