WARN TcpOutputProc - Forwarding to indexer group group1 blocked for 10 seconds.

brent_weaver
Builder

Hello there. I am seeing the message in the subject line on my HEC heavy forwarder (HWF) servers. We are using indexer discovery, and the following is my outputs.conf file:

[indexAndForward]
index = false

[tcpout]
defaultGroup = group1
forwardedindex.filter.disable = true
indexAndForward = false

[indexer_discovery:index_cluster]
pass4SymmKey = $1$C15X23+M+dxTVmLJ/AE=
master_uri = https://10.26.20.8:8089

[tcpout:group1]
autoLBFrequency = 30
forceTimebasedAutoLB = true
indexerDiscovery = index_cluster
useACK = true

The full context of the messages is:

12-07-2017 02:13:30.017 +0000 INFO  IntrospectionGenerator:resource_usage -   RU_main - I-data gathering (Resource Usage) starting; period=10s
12-07-2017 02:13:30.018 +0000 INFO  IntrospectionGenerator:resource_usage -   RU_main - I-data gathering (IO Statistics) starting; interval=60s
12-07-2017 02:13:31.465 +0000 INFO  TcpOutputProc - Initializing connection for non-ssl forwarding to 10.26.200.99:9997
12-07-2017 02:13:31.465 +0000 INFO  TcpOutputProc - Initializing connection for non-ssl forwarding to 10.26.200.187:9997
12-07-2017 02:13:31.465 +0000 INFO  TcpOutputProc - Initializing connection for non-ssl forwarding to 10.26.201.36:9997
12-07-2017 02:13:31.465 +0000 INFO  TcpOutputProc - Initializing connection for non-ssl forwarding to 10.26.200.73:9997
12-07-2017 02:13:31.465 +0000 INFO  TcpOutputProc - Initializing connection for non-ssl forwarding to 10.26.201.72:9997
12-07-2017 02:13:31.465 +0000 INFO  TcpOutputProc - Initializing connection for non-ssl forwarding to 10.26.200.155:9997
12-07-2017 02:13:31.465 +0000 INFO  TcpOutputProc - Will resolve indexer names at 450 second interval.
12-07-2017 02:13:38.383 +0000 INFO  DC:DeploymentClient - channel=tenantService/handshake Will retry sending handshake message to DS; err=not_connected
12-07-2017 02:13:48.024 +0000 INFO  TailReader - Could not send data to output queue (parsingQueue), retrying...
12-07-2017 02:13:50.383 +0000 INFO  DC:DeploymentClient - channel=tenantService/handshake Will retry sending handshake message to DS; err=not_connected
12-07-2017 02:13:56.310 +0000 INFO  TcpOutputProc - Initialization time for indexer discovery service for default group=group1 has been completed.
12-07-2017 02:13:56.313 +0000 INFO  TcpOutputProc - Connected to idx=10.26.200.73:9997 using ACK.
12-07-2017 02:13:56.962 +0000 INFO  TailReader -   ...continuing.
12-07-2017 02:13:57.763 +0000 INFO  KeyManagerLocalhost - Checking for localhost key pair
12-07-2017 02:13:57.764 +0000 INFO  KeyManagerLocalhost - Public key already exists: /opt/splunk/etc/auth/distServerKeys/trusted.pem
12-07-2017 02:13:57.764 +0000 INFO  KeyManagerLocalhost - Reading public key for localhost: /opt/splunk/etc/auth/distServerKeys/trusted.pem
12-07-2017 02:13:57.764 +0000 INFO  KeyManagerLocalhost - Finished reading public key for localhost: /opt/splunk/etc/auth/distServerKeys/trusted.pem
12-07-2017 02:13:57.764 +0000 INFO  KeyManagerLocalhost - Reading private key for localhost: /opt/splunk/etc/auth/distServerKeys/private.pem
12-07-2017 02:13:57.764 +0000 INFO  KeyManagerLocalhost - Finished reading private key for localhost: /opt/splunk/etc/auth/distServerKeys/private.pem
12-07-2017 02:13:57.859 +0000 INFO  HttpPubSubConnection - SSL connection with id: connection_10.26.210.210_8089_ip-10-26-210-210.ec2.internal_ip-10-26-210-210_predix_aws_us_east_1_hec
12-07-2017 02:13:57.862 +0000 INFO  HttpPubSubConnection - Running phone uri=/services/broker/phonehome/connection_10.26.210.210_8089_ip-10-26-210-210.ec2.internal_ip-10-26-210-210_predix_aws_us_east_1_hec
12-07-2017 02:14:01.965 +0000 INFO  TailReader - Could not send data to output queue (parsingQueue), retrying...
12-07-2017 02:14:02.384 +0000 INFO  HttpPubSubConnection - Running phone uri=/services/broker/phonehome/connection_10.26.210.210_8089_ip-10-26-210-210.ec2.internal_ip-10-26-210-210_predix_aws_us_east_1_hec
12-07-2017 02:14:02.385 +0000 INFO  DC:HandshakeReplyHandler - Handshake done.
12-07-2017 02:14:06.079 +0000 WARN  TcpOutputProc - Forwarding to indexer group group1 blocked for 10 seconds.
12-07-2017 02:14:11.965 +0000 INFO  TailReader -   ...continuing.
12-07-2017 02:14:19.966 +0000 INFO  TailReader - Could not send data to output queue (parsingQueue), retrying...
12-07-2017 02:14:21.030 +0000 WARN  TcpOutputProc - Forwarding to indexer group group1 blocked for 10 seconds.
12-07-2017 02:14:26.220 +0000 INFO  TcpOutputProc - Closing stream for idx=10.26.200.73:9997
12-07-2017 02:14:26.221 +0000 INFO  TcpOutputProc - Connected to idx=10.26.200.187:9997 using ACK.
12-07-2017 02:14:26.386 +0000 INFO  TailReader -   ...continuing.
12-07-2017 02:14:34.439 +0000 INFO  TailReader - Could not send data to output queue (parsingQueue), retrying...
12-07-2017 02:14:36.012 +0000 WARN  TcpOutputProc - Forwarding to indexer group group1 blocked for 10 seconds.

Any thoughts are more than welcome. Again, this is only happening on my HEC HWF. Thanks!

brent_weaver
Builder

So, after digging deeper into this issue, it seems that my indexers may not be set up correctly. It appears that THP (Transparent Huge Pages) is set to madvise, which should be OK, but I am not convinced that it is working. Could this be causing the issue on the indexers as well as on the HEC servers?

harsmarvania57
SplunkTrust

It looks like the parsingQueue is blocked, which can happen for various reasons. The answer at https://answers.splunk.com/answers/5590/could-not-send-data-to-the-output-queue.html gives a very good explanation of the same type of problem.

brent_weaver
Builder

Hey there - it turns out that this issue only happens when I enable useACK. I have zero network connectivity issues and am just completely stumped on this one.

ANY help is MUCH appreciated, as Splunk support has no idea what the problem is either.

harsmarvania57
SplunkTrust

If you read the documentation at http://docs.splunk.com/Documentation/Splunk/7.0.1/Forwarding/Protectagainstlossofin-flightdata#How_t... and the topics that follow on the same page, you will get an idea of how acknowledgment works.

There are several possible reasons why your forwarder's queue is full. According to the documentation:

A wait queue can fill up when something is wrong with the network or indexer; however, it can also fill up even when the indexer is functioning normally. This is because the indexer only sends the acknowledgment after it has written the data to the file system. Any delay in writing to the file system will slow the pace of acknowledgment, leading to a full wait queue.

There are a few reasons that a normal functioning indexer might delay writing data to the file system (and so delay its sending of acknowledgments):

    The indexer is very busy. For example, at the time the data arrives, the indexer might be dealing with multiple search requests or with data coming from a large number of forwarders.
    The indexer is receiving too little data. For efficiency, an indexer only writes to the file system periodically -- either when a write queue fills up or after a timeout of a few seconds. If a write queue is slow to fill up, the indexer will wait until the timeout to write. If data is coming from only a few forwarders, the indexer can end up in the timeout condition, even if each of those forwarders is sending a normal quantity of data. Since write queues exist on a per hot bucket basis, the condition occurs when some particular bucket is getting a small amount of data. Usually this means that a particular index is getting a small amount of data.

If none of the above clues helps, I suggest increasing the queue size as described in the documentation and checking whether that helps.
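
As a rough sketch of that last suggestion, the queue size can be raised with the maxQueueSize setting in outputs.conf on the forwarder. The 32MB value below is only an illustrative assumption, not a figure taken from this thread or a recommendation. As I understand the outputs.conf spec, when useACK = true the wait queue is sized at three times maxQueueSize, so a larger value also means more memory used on the forwarder.

```
# outputs.conf on the HEC heavy forwarder (sketch; the size is an assumed example value)
[tcpout:group1]
autoLBFrequency = 30
forceTimebasedAutoLB = true
indexerDiscovery = index_cluster
useACK = true
# Size of the output queue. With useACK = true, the wait queue that holds
# unacknowledged data is derived from this value (3x, per the spec).
maxQueueSize = 32MB
```

A restart of the forwarder is needed for the change to take effect. If the "blocked for 10 seconds" warnings persist with a larger queue, that would point back at slow acknowledgments from the indexers rather than at queue sizing.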
