Valuser asked

How do I restore from a multi-node cluster?

Hi. I am trying to restore from a 3-node cluster to a 2-node or 1-node cluster. I have taken a snapshot of the 3-node cluster (it has 3 tables).

The snapshot is in directories named cassandra/data/keyspace1/table1, cassandra/data/keyspace1/table2, and similarly one more.

I am using sstableloader to restore them (I can only use that). Can I restore the entire keyspace to my target cluster, or should I do it table by table? What is the command for either option?

Will restoring via sstableloader overwrite existing tables if they are present in the keyspace, or skip them?


1 Answer

Erick Ramirez answered

The Cassandra bulk loader utility sstableloader loads table snapshots to a cluster by streaming relevant parts of SSTables to destination nodes. In this context, "relevant parts" means the data which belongs in the token range(s) owned by the destination nodes.

To respond to your questions directly:

  • Each sstableloader invocation loads one table at a time, so a keyspace is restored table by table.
  • The streamed data only supersedes existing data where it is newer. Recall that reads return only the latest version of the data based on the write timestamp.
  • The utility does not read data -- it simply streams the SSTables to the destination nodes.

To run the utility:

$ sstableloader -d ip1,ip2,ip3 [options] /path/to/ks_name/table_name
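Since each run handles a single table, restoring a whole keyspace amounts to repeating that command per table directory. Here is a minimal sketch of such a loop, assuming the snapshot files sit in a layout like /path/to/ks_name/table_name (one subdirectory per table) and that the keyspace and table schemas already exist on the target cluster. The function name load_keyspace and the LOADER variable are made up for illustration and are not part of the utility:

```shell
#!/usr/bin/env bash
# Sketch only: run the loader once per table directory under a keyspace directory.
# LOADER is a hypothetical override; it defaults to the real sstableloader binary.
LOADER="${LOADER:-sstableloader}"

# load_keyspace HOSTS KEYSPACE_DIR
#   HOSTS         comma-separated contact points in the target cluster
#   KEYSPACE_DIR  directory whose subdirectories are named after the tables
load_keyspace() {
  local hosts="$1" ks_dir="$2" table_dir
  for table_dir in "$ks_dir"/*/; do
    # Skip the unexpanded glob when the keyspace directory has no subdirectories.
    [ -d "$table_dir" ] || continue
    # Strip the trailing slash so the path ends in ks_name/table_name.
    "$LOADER" -d "$hosts" "${table_dir%/}" || return 1
  done
}
```

With the placeholders from the command above, the call would look like load_keyspace ip1,ip2 /path/to/ks_name. The loop stops at the first table that fails to load so that the error is not buried under later output.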

For full details, see Cassandra bulk loader. Cheers!

