I am running indexer clustering on 7.0.4. After some network outages, 8600+ buckets have been stuck with the pending fixup status below for more than 2 days.
Cannot fix up primary restoration as previous mask change is in progress; Previous mask change in progress;
Any idea what the root cause is and how to fix it?
If you see many deleteBucket messages like the one below in your cluster peers' splunkd.log, you may be hitting SPL-146575, which is fixed in 6.6.8, 7.0.5, 7.1.2, and 7.2.0:
INFO CMSlave - deleteBucket bid=xxx~111~yyyyy, frozen=true
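To get a rough idea of how many of these messages a peer has logged, something like the following sketch could be run over the lines of splunkd.log. The regular expression is built only from the sample line above, and the assumption that the first `~`-separated field of `bid` is the index name should be checked against your own logs. The function name `count_delete_bucket` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the sample log line above (an assumption, not an
# official Splunk log format definition).
DELETE_BUCKET = re.compile(
    r"CMSlave - deleteBucket bid=(?P<bid>[^,\s]+), frozen=(?P<frozen>true|false)"
)


def count_delete_bucket(lines):
    """Count frozen=true deleteBucket messages, grouped by the first bid field."""
    counts = Counter()
    for line in lines:
        match = DELETE_BUCKET.search(line)
        if match and match.group("frozen") == "true":
            # Assumed bid layout: <index>~<local id>~<guid>
            index = match.group("bid").split("~", 1)[0]
            counts[index] += 1
    return counts


# In-memory example using the placeholder line quoted in this thread.
sample = [
    "INFO CMSlave - deleteBucket bid=xxx~111~yyyyy, frozen=true",
    "INFO CMSlave - some unrelated message",
]
print(count_delete_bucket(sample))
```

For a real check, pass an open file handle for the peer's splunkd.log instead of the `sample` list. A large count on every peer would be consistent with the bug described here, though it does not prove it.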
The problem is a mismatch between the buckets the cluster master thinks the cluster peers have and the buckets the peers actually have.
If you cannot upgrade to 7.0.5 or later to get the fix, make sure all your cluster peers are up and running without any issues, and then restart your cluster master to work around the issue.
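As a quick way to check whether a given Splunk version already includes the fix, here is a minimal sketch based only on the fixed releases listed above. The helper `has_fix` is hypothetical, and its handling of branches outside 6.6 to 7.2 is an assumption: anything newer than 7.2 is treated as fixed, and anything older than 6.6 is simply reported as not carrying the fix.

```python
# Fixed releases for SPL-146575 as listed in this thread, keyed by branch.
FIXED_IN = {
    (6, 6): (6, 6, 8),
    (7, 0): (7, 0, 5),
    (7, 1): (7, 1, 2),
    (7, 2): (7, 2, 0),
}


def has_fix(version):
    """Return True if the dotted version string is at or past the fixed release."""
    parts = tuple(int(p) for p in version.split("."))
    parts = (parts + (0, 0, 0))[:3]  # pad "7.2" to (7, 2, 0)
    branch = parts[:2]
    if branch in FIXED_IN:
        return parts >= FIXED_IN[branch]
    # Assumption: branches after 7.2 include the fix; earlier ones do not.
    return branch > (7, 2)


print(has_fix("7.0.4"))  # the version from the question
print(has_fix("7.0.5"))
```

If this returns False for your version and an upgrade is not possible, the cluster master restart described above is the workaround.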
Restarting Cluster Peers will fix the issue.