I am seeing random timeouts from Splunk forwarders to the Splunk intermediate (heavy) forwarder.
When I run netstat -al | grep 9997, I get:
splunkndx-9997 SYN_SENT
splunkndx-9997 FIN_WAIT1
From splunkd.log on the universal forwarder side:
08-16-2016 02:08:50.047 +0000 WARN TcpOutputProc - Forwarding to indexer group sid_9997 blocked for 1600 seconds.
08-16-2016 02:08:50.978 +0000 WARN TcpOutputProc - Cooked connection to ip=*.*.*.*:9997 timed out
08-16-2016 02:09:11.981 +0000 WARN TcpOutputProc - Cooked connection to ip=*.*.*.*:9997 timed out
When I telnet to port 9997, I get a timeout 9 times out of 10. Is there any config I can tune, or do the heavy forwarders need to be resized? Their CPU usage is less than 10%. Any suggestions?
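This looks like a problem with the route taken to reach the heavy forwarder rather than with Splunk or CPU usage. A connection stuck in SYN_SENT means the SYN was sent but no reply has come back yet, which usually points at the network path. Can you verify the entries in the routing table?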
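
If it helps to put a number on the telnet test, something like the following could be run from a universal forwarder host to repeat the TCP connect and count failures. It is only a sketch: probe and summarize are names made up for this example, and the host, port, attempt count and timeout are placeholders to adjust. It only tests the TCP handshake, not anything Splunk-specific.

```python
import socket
import time


def probe(host, port, attempts=10, timeout=3.0, pause=0.0):
    """Try a plain TCP connect `attempts` times.

    Returns a list of (ok, elapsed_seconds, error_name) tuples.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close the connection; no data is sent.
            with socket.create_connection((host, port), timeout=timeout):
                results.append((True, time.monotonic() - start, None))
        except OSError as exc:
            # Timeouts and refused connections are both OSError subclasses.
            results.append((False, time.monotonic() - start, type(exc).__name__))
        time.sleep(pause)
    return results


def summarize(results):
    """Count successful and failed attempts."""
    ok = sum(1 for r in results if r[0])
    return {"attempts": len(results), "ok": ok, "failed": len(results) - ok}
```

For example, summarize(probe("splunkndx", 9997, attempts=20, pause=1.0)) would try twenty connects one second apart, where "splunkndx" stands in for the heavy forwarder's address. If most attempts fail with a timeout while the heavy forwarder is idle, that is consistent with a network path problem rather than a sizing one.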