Bringing together the Apache Cassandra experts from the community and DataStax.

spiragua_132892 asked Erick Ramirez commented

What is the procedure for recovering a failed VM whose data is intact?

Hi, our customer wants to test the following recovery scenario:

The Cassandra node's data resides on a mounted volume (not a local disk). The node's host (VM) fails, but the data on the device remains intact. The host is recovered from a VM snapshot (NOT a Cassandra snapshot) of the original host taken two hours earlier, with the same Cassandra config files, the same host IP, and the same filesystems mounted (including the node's data directory).

We think that, in this scenario, we only need to start the node on the recovered host and everything should work after a while, since the node will receive its pending hints. Additionally, after starting it we can run a nodetool repair on the node (primary range only?) to force a sync of any pending mutations.

Are we right, or are we missing something?

disaster recovery

1 Answer

Erick Ramirez answered Erick Ramirez commented

The procedure is the same as recovering a physical machine that has completely failed, provided the data disk is intact: just hot-swap the data disk into an identical physical server and the node should be back in operation.

In your case, you need to mount the data disk on another VM, start Cassandra, and then run a full repair on the node (a nodetool repair without any flags). Note that this will work on any VM with Cassandra installed and configured the same as the failed VM. The replacement VM does not need to (a) have the same IP address as the failed node, or (b) be restored from a VM snapshot.
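The steps above can be sketched as a small dry-run script. This is only an illustration, not an official procedure: the device names, the mount points, and the systemd unit name `cassandra` are assumptions, and the helper names (`recovery_plan`, `node_is_up_normal`, `run_plan`) are hypothetical. The `nodetool status` and `nodetool repair` commands are the only Cassandra-specific calls used.

```python
import shlex
import subprocess


def recovery_plan(mounts, service="cassandra"):
    """Return the ordered commands for bringing the node back on a new VM.

    mounts: list of (device, mount_point) pairs, e.g. the data volume and,
    if it lives on a separate volume, the commitlog.
    """
    plan = [["mount", device, mount_point] for device, mount_point in mounts]
    plan.append(["systemctl", "start", service])  # assumed systemd unit name
    plan.append(["nodetool", "status"])           # check that the node shows as UN
    plan.append(["nodetool", "repair"])           # repair with no flags, as in the answer
    return plan


def node_is_up_normal(status_output, address):
    """True if `nodetool status` output lists `address` in state UN (Up/Normal)."""
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "UN" and fields[1] == address:
            return True
    return False


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical device names and mount points; adjust to the real layout.
    plan = recovery_plan([
        ("/dev/sdb1", "/var/lib/cassandra/data"),
        ("/dev/sdc1", "/var/lib/cassandra/commitlog"),
    ])
    run_plan(plan)  # dry run: prints the commands without executing them
```

In practice you would probably pause between the last two steps and only start the repair once `node_is_up_normal` reports the node as UN in the `nodetool status` output.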

As a final note, make sure that you also mount any other volumes, such as those holding the commitlog and other Cassandra directories, if they are on a different volume. Cheers!

3 comments

Thanks Erick! It is good to have a second opinion to validate our thinking. We can share with the community that we proceeded as you explained and our recovery test was successful.

Glad to hear. Cheers!


We completed the recovery test successfully following the procedure described. After recovering the host and the Cassandra node, we ran a full nodetool repair on the node running on the recovery VM.