Screen avatar image
Screen asked Erick Ramirez edited

What is the correct order to restart a cluster for point-in-time restore?

I have a mixed workload cluster across multiple datacenters. I have ran the sstableloader command for the tables I want to restore using snapshots which I had backed up. I have added commit log files which I had backed up from archive to a restore directory on all nodes. I have updated the file with these configs.
What is the correct way and order to restart nodes of my cluster?
Do these considerations apply for restarting as well?

10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

1 Answer

Erick Ramirez avatar image
Erick Ramirez answered

As a general rule, we recommend restarting seed nodes in the DC first before other nodes so gossip propagation happens faster particularly for larger clusters (arbitrarily 15+ nodes). It is important to note that a restart is not required if you restored data using sstableloader.

If you are just performing a rolling restart then the order of the DCs does not matter. But it matters if you are starting up a cluster from a cold shutdown meaning all nodes are down and the cluster is completely offline.

When starting from a cold shutdown, it is important to start with the "Analytics DC" (nodes running in Analytics mode, i.e with Spark enabled) because it makes it easier to elect a Spark master. Assuming that the replication for Analytics keyspaces are configured with the recommended replication factor of 3, you will need to start 2 or 3 nodes beginning with the seeds ideally 1 minute apart because the LeaderManager requires a quorum of nodes to elect a Spark master.

We recommend leaving DCs with nodes running in Search mode (with Solr enabled) last as a matter of convenience so that all the other DCs are operational before the cluster starts accepting Search requests from the application(s). Cheers!

10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.