Why does a Cluster Peer (Indexer) take a long time to start splunkweb when the Cluster Master is down?
In my test environment, I was cleaning up all data on the Indexer Clustering peers.
Because the current version (v6.2.2) has no Splunk utility to clean up all index databases in an Indexer Clustering environment, I did the following steps.
Version: 6.2.2
CM: Cluster Master
CP: Cluster Peer
Start CPs
Start CM
If the CM is up, starting a CP does not stall at "Waiting for web server at http://127.0.0.1:55110 to be available...." and it starts within five seconds.
It seems that splunkweb is stuck until the CP is connected to the CM.
The best practice here is not to stop the cluster master, but to put it into maintenance mode. This prevents the bucket fixup and cluster rebalancing processes from running.
After that, you can stop the individual peers, clean the indexes, and restart them. Once this is done across the cluster, take the master out of maintenance mode.
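The order of operations above can be sketched as a small Python helper that only builds the command list and does not run anything. The host labels, the function name `build_clean_plan`, and the default clean command are assumptions for illustration. In particular, `splunk clean eventdata -f` is the standalone-indexer clean command with the confirmation prompt skipped; whether it is appropriate on a cluster peer in your version should be checked first, so it is left as a parameter.

```python
from typing import List, Optional, Tuple

Step = Tuple[str, List[str]]


def build_clean_plan(
    peers: List[str],
    splunk_bin: str = "splunk",
    clean_argv: Optional[List[str]] = None,
) -> List[Step]:
    """Return (host, argv) steps in the order described in the answer.

    Nothing is executed here; the caller decides how to run each step
    on the named host (ssh, configuration management, by hand, ...).
    """
    if clean_argv is None:
        # Assumption: cleans all indexes on a stopped peer without prompting.
        clean_argv = [splunk_bin, "clean", "eventdata", "-f"]

    # 1. Put the master into maintenance mode instead of stopping it.
    plan: List[Step] = [("master", [splunk_bin, "enable", "maintenance-mode"])]

    # 2. Stop, clean, and restart each peer in turn.
    for peer in peers:
        plan.append((peer, [splunk_bin, "stop"]))
        plan.append((peer, list(clean_argv)))
        plan.append((peer, [splunk_bin, "start"]))

    # 3. Take the master out of maintenance mode once all peers are done.
    plan.append(("master", [splunk_bin, "disable", "maintenance-mode"]))
    return plan


if __name__ == "__main__":
    # "peer1" and "peer2" are placeholder host labels.
    for host, argv in build_clean_plan(["peer1", "peer2"]):
        print(f"{host}: {' '.join(argv)}")
```

Keeping the plan as data makes it easy to review the exact sequence before anything destructive is run on a peer.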
When the peers start, they attempt to reach the cluster master, register, and get peer information in order to meet the cluster's search and replication factors. So yes, without the cluster master up and running, it will take time for the peers to start.
During the start process, you can monitor the splunkd.log file (in $SPLUNK_HOME/var/log/splunk/splunkd.log) and see its connection attempts to the cluster master.
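As a rough way to pull those connection attempts out of the log, here is a minimal Python sketch that filters splunkd.log by keyword. The function names, the fallback path `/opt/splunk`, and the keywords `cluster` and `master` are assumptions; the exact wording of the log messages is not given in this thread, so adjust the keywords to whatever your peer actually logs.

```python
import os
from typing import Iterable, Iterator


def splunkd_log_path(splunk_home: str) -> str:
    """Build the splunkd.log path under the given Splunk home directory."""
    return os.path.join(splunk_home, "var", "log", "splunk", "splunkd.log")


def matching_lines(lines: Iterable[str], keywords: Iterable[str]) -> Iterator[str]:
    """Yield lines containing any keyword (case-insensitive), newline stripped."""
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Assumption: fall back to /opt/splunk when SPLUNK_HOME is not set.
    home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    path = splunkd_log_path(home)
    if os.path.exists(path):
        with open(path, errors="replace") as fh:
            # Keywords are guesses; tune them to the messages you see.
            for line in matching_lines(fh, ["cluster", "master"]):
                print(line)
```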
Thanks for your advice.
I wanted to confirm that this slow start-up is expected when the CM is down.