Splunk Enterprise Security

How to prevent congestion between Heavy Forwarders and Indexers?

vr2312
Contributor

Yesterday we observed that the indexing queue on our indexers was more than 90% full.

This resulted in failed connections between Heavy Forwarders (HF) and Indexers.

Once the indexing queue drained, data from the HFs started flowing to the indexers again and was written to disk.

I have a few questions regarding this:

  1. Our environment hosts Splunk IT Service Intelligence and Splunk Enterprise Security, which are both premium apps. Could the searches targeting the indexers also be a cause of the blocked queues?
  2. What is the maximum number of TCP connections an indexer can accept?
  3. Any inputs on how to avoid such cases in the future?

woodcock
Esteemed Legend

Due to MAJOR improvements in the S2S (Splunk-to-Splunk) protocol and the Universal Forwarder build, if you are on v6 (particularly the later v6 releases), then you should only be using HFs for things like DBConnect. For things like syslog, you should DEFINITELY be using a Universal Forwarder. This is the answer to #3.
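For the syslog case, here is a rough sketch of what the UF side could look like. It assumes your syslog daemon (rsyslog, syslog-ng, or similar) already writes the received traffic to files, one directory per sending host, and that a UF is installed on that same server. The path and directory layout are placeholders, not something taken from your environment.

```ini
# inputs.conf on the Universal Forwarder running on the syslog server (sketch)
# Assumption: the syslog daemon writes to /var/log/remote/<sending-host>/<file>.log
[monitor:///var/log/remote/*/*.log]
sourcetype = syslog
# Use the 4th path segment (/var/log/remote/<sending-host>/...) as the host field.
host_segment = 4
```

The point of this layout is that the UF only tails files and forwards them, so no parsing tier sits between the syslog server and the indexers.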


vr2312
Contributor

This is our infrastructure:

Servers -> UF -> HF -> Indexers
Desktops -> UF -> HF -> Indexers
Syslog Servers -> HF -> Indexers
DBConnect HF -> Indexers

We are on version 6.4.4.


woodcock
Esteemed Legend

Your architecture is very v4 and is now an albatross around your bottleneck. In the updated v6 hotness it should be like this:
Servers -> UF -> Indexers
Desktops -> UF -> Indexers
Syslog Servers -> UF -> Indexers
DBConnect HF -> Indexers

The key on all the UFs is to set autoLB=true in outputs.conf and also EVENT_BREAKER_ENABLE and EVENT_BREAKER in props.conf for every input, to ensure proper balancing. Do not use external load balancers, either.
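As a minimal sketch of the outputs.conf side, assuming three indexers listening on port 9997 (the group name, host names, and port are placeholders; adjust them to your environment):

```ini
# outputs.conf on each Universal Forwarder (sketch, placeholder names)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# List every indexer so the forwarder balances across all of them itself,
# with no external load balancer in between.
server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
# Built-in automatic load balancing.
autoLB = true
# Seconds before the forwarder switches to another indexer in the list.
autoLBFrequency = 30
# Indexer acknowledgment, if you want to keep it enabled.
useACK = true
```

With every UF pointed at the full indexer list, each forwarder rotates through the indexers on its own schedule, which is what spreads the load.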


vr2312
Contributor

Thank you for your inputs, @woodcock. Is there any documentation where this is published, so that I can read through it before making these major changes?

Going by your response, you are asking me to remove the HF tier completely. Am I getting this right?

autoLB is already set to true, with indexer acknowledgment (useACK) enabled.


woodcock
Esteemed Legend

Keep the HF for DBConnect only and yes, ditch the rest. The documentation about this evolution is not as clear as it should be, but all of the testing that I have seen mirrors the PS scuttlebutt/buzz that I have been hearing: best practices have evolved to exclude HFs except in a very few extreme circumstances. Here are a few places where there is some documentation:

https://www.splunk.com/blog/2014/03/18/time-based-load-balancing.html
http://docs.splunk.com/Documentation/Forwarder/6.6.0/Forwarder/Configureloadbalancing
https://docs.splunk.com/Documentation/Splunk/6.6.0/Admin/Outputsconf
forceTimebasedAutoLB = [true|false]
* Forces existing streams to switch to newly elected indexer every
  AutoLB cycle.
* On universal forwarders, use the EVENT_BREAKER_ENABLE and
  EVENT_BREAKER settings in props.conf rather than forceTimebasedAutoLB
  for improved load balancing, line breaking, and distribution of events.
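To make the props.conf part concrete, here is a small sketch for a single-line sourcetype. The stanza name is only an example, and the regex assumes one event per line. As far as I know these settings need a 6.5 or later forwarder, so please verify against the props.conf spec for your version before relying on it with 6.4.4 forwarders.

```ini
# props.conf on the Universal Forwarder (sketch; stanza name is an example)
[syslog]
# Let the UF recognize event boundaries so it can switch indexers between events.
EVENT_BREAKER_ENABLE = true
# Regex whose first capture group marks the boundary between events;
# this one treats each run of newlines as a boundary (single-line events).
EVENT_BREAKER = ([\r\n]+)
```

Multi-line sourcetypes would need a regex that matches the start of the next event instead, otherwise the forwarder could switch indexers in the middle of an event.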
