How to stop a cluster?

jtworzydlo
Path Finder

I have 2 clusters in my Splunk environment, located on 4 hosts.
Because of some patching, the hosts need to be restarted, and I need to make sure the Splunk clusters go down safely and start properly after the restart.
How should I do that?

0 Karma

linu1988
Champion

I think you can stop the instances one by one and then use the rolling-restart command to start all of the peers. I also couldn't find anything about stopping the cluster from the master node.

0 Karma

ofrachon
Path Finder

Have you read this page?
http://docs.splunk.com/Documentation/Splunk/5.0.3/Indexer/Restartthecluster

Everything standard is explained there.

0 Karma

ofrachon
Path Finder

Ok. gfuente has written some nice tips up there, and you can also find the proper order for upgrading Splunk here: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Upgradeacluster. That should help you a lot with patching the servers themselves.

jtworzydlo
Path Finder

4 hosts, 2 clusters, each cluster on 2 hosts. In each cluster, the first host runs the cluster master plus a peer, and the second host runs a peer plus the search head.
The forwarders switch indexers every 60 seconds.
My replication factor = 2, search factor = 2.
I do not need to take all the hosts down at once; I can do it sequentially. I can easily take down the host with the peer and search head, but I do not know how to take down the host with the cluster master so that the cluster works properly once the cluster master is started again.

0 Karma

ofrachon
Path Finder

Well then, it all depends on how your forwarders are configured, what your replication and search factors are, and how the various functions are split across your 2 hosts: master node? search head? indexers?

0 Karma

jtworzydlo
Path Finder

Yes, the restart is described there, but unfortunately my case is not a restart.
I need to stop the entire cluster (4 instances on 2 hosts), then the hosts are patched and restarted, and after that I need to start the whole cluster again. The patching might take a few hours.

0 Karma

gfuente
Motivator

Then you should probably use the offline command on each peer, restart it, and then do the same on the other peers. Once all the peers have been restarted, you only need to restart the master; in this case you should probably kill the master node's splunk process. The cluster will continue working without the master node. Then restart the master, and when it comes back online it will synchronize with the peers.

I think this will work.
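
In case it helps, here is a rough sketch of that order as a small Python helper that only builds the list of CLI commands (it does not run anything). The install paths and the function name `restart_plan` are made up for illustration, and it uses a plain `splunk stop` for the master instead of killing the process, which is my assumption rather than something tested on your setup. The host patching and reboot would happen between each `offline` and the following `start`.

```python
from typing import List


def restart_plan(peer_homes: List[str], master_home: str) -> List[str]:
    """Build the ordered Splunk CLI commands: peers one by one, master last.

    peer_homes and master_home are hypothetical SPLUNK_HOME directories,
    one per Splunk instance.
    """
    plan = []
    for home in peer_homes:
        # 'splunk offline' takes a peer down gracefully.
        plan.append(f"{home}/bin/splunk offline")
        # Patch/reboot the host here, then bring the peer back up.
        plan.append(f"{home}/bin/splunk start")
    # The master goes last; the peers keep running while it is down.
    plan.append(f"{master_home}/bin/splunk stop")
    plan.append(f"{master_home}/bin/splunk start")
    return plan


if __name__ == "__main__":
    # Example paths only; replace with your own instance directories.
    for cmd in restart_plan(["/opt/splunk_peer1", "/opt/splunk_peer2"],
                            "/opt/splunk_master"):
        print(cmd)
```

Printing the plan first and running the commands by hand on each host seems safer than automating it, at least for the first patching window.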
