On a universal forwarder that is apparently sending data, there are a large number (5.5k) of blocked=true queue messages for indexqueue and auditqueue. Why is indexqueue showing up on a universal forwarder?
03-14-2014 14:50:44.502 +0000 INFO Metrics - group=queue, name=auditqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1109, largest_size=1109, smallest_size=1109
03-14-2014 14:50:44.502 +0000 INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1110, largest_size=1110, smallest_size=1110
First off, the index queue exists on universal forwarders only as the holding queue in front of the TCPout processor, which is responsible for sending the data out to the configured receivers. No actual indexing activity is in the picture, of course.
In all likelihood, one of three things is happening:
The UF's self-imposed outgoing throughput limit (the "maxKBps" key in the "thruput" stanza of limits.conf) is kicking in and throttling the emission of data. Note that the default value for this setting is only 256 KBps.
The indexer(s) downstream are saturated and cannot keep up with the current aggregate indexing rate. As a result, their queues fill up and they push back on the forwarders. The next steps in this investigation are to examine the pattern of event-processing queue saturation and of CPU usage by splunkd processors, which can be done with the "Indexing Performance" and "Distributed Indexing Performance" views provided with the S.o.S app.
If indexers are fine and no local throughput limit is in place, this could be a network throughput limitation between the UF and the indexers. This is rare, but does happen. It's also not easy to demonstrate, unfortunately.
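To rule out the first cause, check the effective throughput cap on the UF and raise or remove it if needed. A minimal sketch of the relevant limits.conf stanza (the value shown is an example, not a recommendation for every environment):

```
# $SPLUNK_HOME/etc/system/local/limits.conf on the UF
[thruput]
# Default is 256 (KBps) on a universal forwarder.
# 0 removes the limit entirely; restart splunkd for the change to take effect.
maxKBps = 0
```

You can confirm the value the forwarder is actually using with `splunk btool limits list thruput` on the UF.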
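To check for the second cause, a search over the indexers' internal metrics will show which queues are blocking and where. This assumes you run the search on the indexers (or that their _internal data is searchable from wherever you run it):

```
index=_internal source=*metrics.log* group=queue blocked=true
| stats count by host, name
| sort - count
```

Heavy blocking on the indexers' own queues (e.g. indexqueue, typingqueue) points at indexer saturation rather than a problem on the forwarder.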