Valuser asked Erick Ramirez answered

Do I need to run sstableloader on snapshots from all nodes of the source cluster?


I have a snapshot of a 5-node cluster. Each node's snapshot is in a separate folder, so there are 5 folders, each representing a node. Inside each folder is the keyspace k1, which contains 2 tables, t1 and t2. I am using sstableloader to restore them to another cluster. To restore, I have to go through each of the 5 node snapshot folders and run the sstableloader command for each table.

So in total I have to run the sstableloader command 10 times. Can I reduce the number of runs by either supplying the keyspace as the sstableloader argument or supplying all the tables as a list?

Or is there another option?


1 Answer

Erick Ramirez answered

It is not possible to specify just the keyspace or to load multiple tables with a single command. The sstableloader utility is designed to load data into a single table, which is why you need to specify the path to a table's files when you run it:

$ sstableloader -d [ip1,...] [options] /path/to/ks_name/table_name/

On the subject of having 5 separate snapshots from 5 different nodes: technically it is possible to place all of them in the same directory, BUT we don't recommend it. If SSTable files from 2 different nodes share the same generation number (e.g. -5678-), one node's files will overwrite the other's when you copy them into the same directory.

It is safer to keep the snapshots in separate folders for this reason. You can also increase the load throughput by running multiple instances of sstableloader in parallel, provided your cluster can take the additional load. Cheers!
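Although sstableloader still has to be invoked once per table directory, a small wrapper script can save you from typing the 10 commands by hand. The following is only a rough sketch, assuming the layout from the question (`<root>/<node>/<keyspace>/<table>/`). The names `SSTABLELOADER`, `load_node` and `load_all` are made up for illustration and are not part of Cassandra. It loads the tables of each node one after another and runs one such loop per node folder in parallel:

```shell
#!/bin/sh
# Sketch only: assumes the layout <root>/<node>/<keyspace>/<table>/
# SSTABLELOADER, load_node and load_all are illustrative names.
SSTABLELOADER="${SSTABLELOADER:-sstableloader}"

# Load every table directory under one node's snapshot folder, one at a time.
load_node() {
  node_dir="$1"
  hosts="$2"
  for table_dir in "$node_dir"/*/*/; do
    [ -d "$table_dir" ] || continue   # glob matched nothing
    "$SSTABLELOADER" -d "$hosts" "$table_dir" || return 1
  done
}

# Start one background load_node per node folder, then wait for all of them.
load_all() {
  root="$1"
  hosts="$2"
  pids=""
  status=0
  for node_dir in "$root"/*; do
    [ -d "$node_dir" ] || continue
    load_node "$node_dir" "$hosts" &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || status=1           # remember any failed node
  done
  return "$status"
}
```

You would call it with something like `load_all /path/to/snapshots ip1,ip2`, where both arguments are placeholders for your own snapshot root and target hosts. If the target cluster cannot take 5 loaders at once, remove the `&` so the nodes are processed sequentially.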
