Splunk Enterprise Security

How to prevent congestion between Heavy Forwarders and Indexers?

vr2312
Contributor

Yesterday we observed that the indexing queue on our indexers was more than 90% full.

This resulted in failed connections between Heavy Forwarders (HF) and Indexers.

Once the indexing queue drained, data from the HFs started flowing to the indexers again and was written to disk.

I have a few questions about this:

  1. Our environment hosts Splunk IT Service Intelligence and Splunk Enterprise Security, which are both premium apps. Could the searches these apps run against the indexers also be a cause of the blocked queues?
  2. What is the maximum number of TCP connections an indexer can accept?
  3. Any suggestions on how to avoid this in the future?

woodcock
Esteemed Legend

Due to MAJOR improvements in the S2S (Splunk-to-Splunk) protocol and the Universal Forwarder build, if you are on v6 (particularly a later v6 release), you should only be using HFs for things like DBConnect. For things like syslog, you should DEFINITELY be using a Universal Forwarder. This is the answer to #3.

vr2312
Contributor

This is our infrastructure:

Servers -> UF -> HF -> Indexers
Desktops -> UF -> HF -> Indexers
Syslog Servers -> HF -> Indexers
DBConnect HF -> Indexers

We are on version 6.4.4.

woodcock
Esteemed Legend

Your architecture is very v4 and is now an albatross around your bottleneck. In the updated v6 hotness it should be like this:
Servers -> UF -> Indexers
Desktops -> UF -> Indexers
Syslog Servers -> UF -> Indexers
DBConnect HF -> Indexers

The key on all the UFs is to set autoLB=true in outputs.conf and also EVENT_BREAKER (together with EVENT_BREAKER_ENABLE, in props.conf) for the sourcetype of every input to ensure proper balancing. Do not use external load balancers, either.
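A rough sketch of what that could look like on a UF. The group name, indexer hostnames, port, sourcetype name, and regex are placeholders I made up for illustration, not values from your environment. Also note that, as far as I know, the EVENT_BREAKER settings arrived with the 6.5 forwarder, so a 6.4.4 UF would probably need upgrading first:

```ini
# outputs.conf on each UF (group name and hostnames are placeholders)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# List every indexer; the forwarder rotates among them by itself
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
autoLB = true
# Seconds between switches to another indexer (30 is the usual default)
autoLBFrequency = 30
useACK = true

# props.conf on each UF (sourcetype name and regex are placeholders)
[my_syslog]
EVENT_BREAKER_ENABLE = true
# Tells the UF where one event ends, so it can switch indexers safely
EVENT_BREAKER = ([\r\n]+)
```

Each sourcetype the UF forwards would need its own props.conf stanza, with a regex that matches that data's real event boundary.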

vr2312
Contributor

Thank you for your input, @woodcock. Is there any documentation where this is published, so that I can read through it before making these major changes?

Judging by your response, you are asking me to remove the HF tier completely. Am I getting this right?

autoLB is already set to true, with indexer acknowledgment enabled.

woodcock
Esteemed Legend

Keep the HF for DBConnect only and yes, ditch the rest. The documentation about this evolution is not as clear as it should be, but all of the testing that I have seen mirrors the PS scuttlebutt/buzz that I have been hearing: best practices have evolved to exclude HFs except in a very few extreme circumstances. Here are a few places where there is some documentation:

https://www.splunk.com/blog/2014/03/18/time-based-load-balancing.html
http://docs.splunk.com/Documentation/Forwarder/6.6.0/Forwarder/Configureloadbalancing
https://docs.splunk.com/Documentation/Splunk/6.6.0/Admin/Outputsconf
forceTimebasedAutoLB = [true|false]
* Forces existing streams to switch to newly elected indexer every
  AutoLB cycle.
* On universal forwarders, use the EVENT_BREAKER_ENABLE and
  EVENT_BREAKER settings in props.conf rather than forceTimebasedAutoLB
  for improved load balancing, line breaking, and distribution of events.
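For a forwarder that predates those props.conf settings (a 6.4.x UF, for example), the older setting quoted above could be used as a stopgap. This is only a sketch: the group name and hostnames are placeholders, and because this setting forces a switch on a timer rather than at a known event boundary, test it against your own data before relying on it:

```ini
# outputs.conf (group name and hostnames are placeholders)
[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
# Seconds between switches to another indexer
autoLBFrequency = 30
# Force long-lived streams to move to the newly elected indexer each cycle
forceTimebasedAutoLB = true
```

Once the forwarders are on a version that supports EVENT_BREAKER, the spec text above says to prefer that over forceTimebasedAutoLB.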
