Is load balancing in UF/HF bound to cause data imbalance on IDXC eventually?

morethanyell
Builder

Is load balancing in the HF and UF outputs.conf bound to cause data imbalance on the IDXC over time? If yes, I wholeheartedly accept that data rebalancing is something we need to do on a regular basis. I just need confirmation from the community.

If not, does that mean this line (from the Splunk docs)

"every 30 seconds, the forwarder switches the data stream to another indexer in the group, selected at random"

guarantees that the amount of data stays balanced among the peers? If that is really the case, then what causes data imbalance, and how can it be prevented?

Thanks in advance.

anmolpatel
Builder

My two cents' worth. Event sizes vary, and data volume depends on the day and the time of day, so some imbalance is inevitable when dealing with hundreds of GB to TB+ of data.
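To make that concrete, here is a rough toy simulation (not Splunk code). It assumes a single forwarder that picks an indexer at random every 30 seconds, with a made-up traffic shape: busier business hours plus random bursts. The function name `simulate`, the rates, and the burst distribution are all assumptions for illustration only.

```python
import random

def simulate(num_indexers=4, hours=24, switch_seconds=30, seed=7):
    """Toy model: one forwarder, one random indexer choice per time window."""
    rng = random.Random(seed)
    received = [0.0] * num_indexers
    windows = hours * 3600 // switch_seconds
    for w in range(windows):
        hour = (w * switch_seconds // 3600) % 24
        # Assumed traffic shape: busier during business hours.
        base_rate = 200_000 if 8 <= hour < 18 else 20_000  # bytes per second
        # Assumed burstiness: log-normal multiplier on the base rate.
        burst = rng.lognormvariate(0, 1.0)
        target = rng.randrange(num_indexers)
        received[target] += base_rate * burst * switch_seconds
    return received

totals = simulate()
mean = sum(totals) / len(totals)
for i, t in enumerate(totals):
    print(f"idx{i}: {t / 1e9:6.2f} GB ({(t - mean) / mean:+.1%} vs mean)")
```

Even though every indexer is equally likely to be chosen, the per-indexer totals differ, because each window carries a different amount of data. Real clusters have many forwarders, which smooths this out somewhat but does not remove it.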

Don't rely on the default values to minimise imbalance.
I would configure the forwarders to send smaller bursts of data across the indexers in the cluster, e.g.:

[tcpout]
autoLBFrequency = 15
# autoLBVolume: the max amount of data (in bytes) to send to one indexer
# in case the frequency threshold isn't met first
autoLBVolume = 500000
# Note: also check that props.conf has the right EVENT_BREAKER key/value pairs in place
forceTimebasedAutoLB = true
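As a rough illustration of why shorter, smaller bursts tend to help, the sketch below replays the same synthetic traffic under two settings: a 30 second switch with no volume cap, and the 15 second / 500000 byte values from the stanza above. It is a simplified model under stated assumptions, not the forwarder's actual implementation: it works in one-second steps, switches when either threshold is reached, and uses invented traffic. The helper names `make_traffic`, `distribute`, and `spread` are hypothetical.

```python
import random
import statistics

def make_traffic(seconds, seed=1):
    """Assumed traffic: bursty bytes-per-second values (log-normal)."""
    rng = random.Random(seed)
    return [int(50_000 * rng.lognormvariate(0, 1.0)) for _ in range(seconds)]

def distribute(traffic, num_indexers, lb_frequency, lb_volume=0, seed=2):
    """Switch to a random indexer when the volume cap or the timer is hit."""
    rng = random.Random(seed)
    received = [0] * num_indexers
    target = rng.randrange(num_indexers)
    elapsed = 0  # seconds since the last switch
    sent = 0     # bytes sent to the current target
    for nbytes in traffic:
        received[target] += nbytes
        sent += nbytes
        elapsed += 1
        volume_hit = lb_volume > 0 and sent >= lb_volume
        if volume_hit or elapsed >= lb_frequency:
            target = rng.randrange(num_indexers)
            elapsed = 0
            sent = 0
    return received

def spread(received):
    """Coefficient of variation: 0 means perfectly even distribution."""
    return statistics.pstdev(received) / statistics.mean(received)

traffic = make_traffic(6 * 3600)
default_like = distribute(traffic, 4, lb_frequency=30)
tuned = distribute(traffic, 4, lb_frequency=15, lb_volume=500_000)
print(f"30s, no volume cap : spread = {spread(default_like):.4f}")
print(f"15s, 500000 B cap  : spread = {spread(tuned):.4f}")
```

More frequent switching means more, smaller chunks per indexer, so the totals usually even out better. Neither setting reaches a spread of exactly zero, which is why rebalancing still has a place.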

Is rebalancing something to do regularly? I would say yes. The frequency will be environment dependent and tied to your set maintenance windows. The criticality of the data also matters, especially for security data sources, so I would carry out a risk analysis to weigh that in.

Here is a Splunk .conf presentation that explains it quite well:
https://conf.splunk.com/files/2016/slides/rebalancing-data-across-an-indexer-cluster.pdf

morethanyell
Builder

Enlightening. Thanks.
