Deployment Architecture

Why are we getting "Replication factor not met" in our multisite indexer clustering environment?

darshan_singh01
Path Finder

We have configured multisite indexer clustering (2 peers at each of site1 and site2, and one search head at site1) with the server.conf settings below on the master server and the indexers.

Master server.conf

[general]
pass4SymmKey = $1$xNRfsRamx/pN
site = site1

[clustering]
available_sites = site1,site2
mode = master
multisite = true
pass4SymmKey = $1$9MxSqh+o6q08TJov
site_search_factor = origin:1,total:2
site_replication_factor = origin:2,total:3

.....................................
Indexers server.conf:

[general]
site = site1

[replication_port://7778]

[clustering]
master_uri = https://x.x.x.x:8089
mode = slave
pass4SymmKey = whatever

We are getting a "Replication factor not met" error on the master server's dashboard, along with the error "Missing enough suitable candidates to create replicated copy in order to meet replication policy. Missing={ site2:1 }".

Only 4 _audit and 4 _internal index buckets are not replicating. All the rest, including the main index buckets, are replicating fine. Please help.

1 Solution

dxu_splunk
Splunk Employee

These are likely pre-multisite buckets. You can confirm this by querying the cluster master endpoint /services/cluster/master/buckets?filter=replication_count<3 and checking that the buckets listed there have constrain_to_origin_site = 1.

see answers question / docs

You can try setting replication_factor=2 in the cluster master's server.conf and restarting the master; the dashboard should then show RF/SF as met. We don't replicate pre-multisite buckets across sites, so with 2 indexers per site, replication_factor=2 makes sense.
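As a rough sketch of that change, the master's [clustering] stanza might look like the fragment below. The multisite settings are copied from the question; only the replication_factor line is new. search_factor is left at its default here, which is an assumption about this environment, and pass4SymmKey is omitted.

```
[clustering]
mode = master
multisite = true
available_sites = site1,site2
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2
# Applies only to buckets created before multisite was enabled.
# 2 matches the two peers available within each site.
replication_factor = 2
```

The master needs a restart for the change to take effect.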

darshan_singh01
Path Finder

Thanks dxu_splunk. I also think these are pre-multisite clustering buckets. Replication of the main index gives no errors, but 4 buckets each of the _audit and _internal indexes are not replicating.

You suggested setting replication_factor=2. But in that case, if a disaster takes site1 down and all copies of a bucket reside only at site1, since there are only two replicated copies (replication_factor=2), how would disaster recovery work?

Your help would be appreciated.


dxu_splunk
Splunk Employee

replication_factor only affects non-multisite buckets (the 4+4 buckets you mentioned). We do not replicate them across sites: if the source bucket is on siteA, it is replicated within siteA, so these buckets have no site-level disaster recovery. Buckets created after multisite was enabled follow site_replication_factor/site_search_factor, so your actual data should be fine.

darshan_singh01
Path Finder

It worked 🙂 🙂

Thanks a lot, dxu_splunk & esix_splunk.


esix_splunk
Splunk Employee

You will need to keep the replication_factor and search_factor settings in the [clustering] stanza. They provide legacy support for indexes whose buckets were created before multisite was configured.
They have no effect on multisite replicated buckets, which use the site_* settings.

This is a known issue.

Sourabhv05
Communicator

Hi Darshan,
Try rolling the hot buckets by running the following command:

splunk _internal call /data/indexes/_audit/roll-hot-buckets -auth admin:changeme

Wait for some time and see if the problem gets resolved.

darshan_singh01
Path Finder

Nope, same state.
