Getting Data In

Why is my universal forwarder not evenly spreading events coming from a high-traffic input across all indexers?

hexx
Splunk Employee

I have noticed that universal forwarders receiving data from a high-traffic input will fail to distribute events evenly across all indexers in the autoLB group defined in outputs.conf.
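For illustration, a typical autoLB group in outputs.conf looks something like this (the indexer names and ports below are placeholders, not my actual environment):

    [tcpout]
    defaultGroup = primary_indexers

    [tcpout:primary_indexers]
    # With auto load balancing, the forwarder sends to one indexer from this
    # list at a time and periodically switches to another
    # (autoLBFrequency defaults to 30 seconds).
    server = indexer1.example.com:9997, indexer2.example.com:9997, indexer3.example.com:9997
    autoLB = true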

What is the reason for this? Can anything be done to prevent it?

1 Solution

Mick
Splunk Employee

To address this issue and spread the data evenly, use a regular (heavy) forwarder to collect and parse the data before sending it to the indexers.

With the Universal Forwarder, minimal parsing is performed on the forwarder side before sending the data onwards. This means that the UF has no idea where line-breaks occur between events, so in order to use auto-LB, it has to wait until there's a break in the data-stream before switching the output connection to a new indexer. The same behaviour would be observed if it was monitoring a file, and the logging application never stopped writing to that file. As long as data from a specific source keeps appearing fast enough, the UF will continue to send that data to a single indexer in order to avoid corruption of the index.

A regular forwarder fully parses the data before sending it to the indexers, making it easy to identify points where the connection can be switched. Note that this instance will increase resource usage on the host server, so if that box is running critical applications, we would advise using a separate, dedicated box for this purpose.
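As a rough sketch of that topology (the hostnames and ports here are placeholders), the UF forwards everything to the heavy forwarder, and the heavy forwarder load-balances the parsed stream across the indexers:

    # outputs.conf on the universal forwarder: send the raw stream to the heavy forwarder
    [tcpout]
    defaultGroup = heavy_forwarder

    [tcpout:heavy_forwarder]
    server = hf1.example.com:9997

    # outputs.conf on the heavy forwarder: the data is parsed here, so the
    # connection can be switched between indexers at event boundaries
    [tcpout]
    defaultGroup = indexers

    [tcpout:indexers]
    server = indexer1.example.com:9997, indexer2.example.com:9997, indexer3.example.com:9997
    autoLB = true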


sonicZ
Contributor

Hi Mick,

Thanks for this answer; we are having this exact problem at just under 100 GB a day. Here is one of our daily reports.

host sum(mb)
splunk-w1-inf53 7209.3280268010
splunk1-d1-inf 24543.7608717865
splunk2-d2-inf 17171.5732630553
splunk2-w2-inf5 9376.1935319420
splunk3-d1-inf 16665.1996564898
splunk4-d2-inf 24236.0354538779

As you can see, splunk1-d1-inf and splunk4-d2-inf got about 24 GB, compared to the others, which got 7-17 GB.

It seems we rather randomly get big bursts of data on some indexers compared to others.
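For reference, a report like the one above can be produced with a search roughly along these lines (a sketch based on the per-host throughput metrics in _internal, not our exact search; field names may need adjusting):

    index=_internal source=*metrics.log* group=thruput name=thruput
    | eval mb = kb / 1024
    | stats sum(mb) BY host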
