
nicholasamorim asked ·

Shipping Disk to New DC in order to sync 50TB of data

We're adding a new datacenter to our Cassandra cluster. Currently we have a 15-node DC with RF=3, which holds about 50 TB of data.

We are adding another datacenter in a different country and we want both datacenters to contain all the data. Obviously, synchronising 50TB of data across the internet will take a gargantuan amount of time.

Is it possible to copy a full backup to a few disks, ship them to the new DC and then restore from them? I'm just wondering what the procedure would be.

  • Could someone give me a few pointers on this operation, if possible at all?
  • Or any other tips?

Not sure if it matters but our new DC is going to be smaller (6 nodes) for the time being, although space will be available. The new DC is mostly meant as a live-backup/failover and will not be the primary cluster for writing, generally speaking.

cluster, datacenter

1 Answer

Erick Ramirez answered ·

@nicholasamorim it isn't really possible to copy (or restore) the data files from one DC to another because partitions (roughly the equivalent of records in an RDBMS) are distributed differently across the nodes of the existing DC and the new DC. For example, partition XYZ, which is owned by node J in the source DC, could be owned by node B in the new DC.
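To illustrate why the files don't line up, here is a toy sketch of token-based ownership. Everything in it is an assumption for illustration: the node names are made up, each node gets a single token, and MD5 stands in for Cassandra's actual partitioner, so it only shows the principle, not real placement.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Stand-in hash for illustration only; not Cassandra's real partitioner."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def build_ring(node_names):
    """Give each node one token derived from its name; return sorted (token, node) pairs."""
    return sorted((token_for(name), name) for name in node_names)


def owner(ring, key: str) -> str:
    """Return the node holding the first token >= the key's token, wrapping around the ring."""
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[idx][1]


# Hypothetical node names: 15 nodes in the source DC, 6 in the new DC.
source_dc = build_ring([f"dc1-node{i}" for i in range(1, 16)])
new_dc = build_ring([f"dc2-node{i}" for i in range(1, 7)])

for key in ("XYZ", "ABC", "user:42"):
    print(key, "->", owner(source_dc, key), "|", owner(new_dc, key))
```

Each DC has its own set of tokens, so the same partition key maps to an unrelated node in each DC, and a node's data files in one DC do not correspond to any single node's token ranges in the other.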

If you were building a completely separate cluster in another country, then it would be possible to bulk load a snapshot into the new cluster using the sstableloader utility. The steps are documented in Restoring from a snapshot. WARNING: this method is NOT valid if the new DC is part of the same cluster as the existing (source) DC. Cheers!
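For the separate-cluster case only, a minimal sketch of how the sstableloader invocation could be assembled is shown below. The contact-point addresses and the staging path are hypothetical, and it assumes the snapshot files have already been copied into a directory whose last two components are the keyspace and table names, which is the layout sstableloader expects. Follow the official restore procedure for the actual steps.

```python
def build_sstableloader_command(contact_points, table_dir):
    """Build the argv for one sstableloader run.

    contact_points: addresses of nodes in the *target* cluster (hypothetical here).
    table_dir: directory ending in <keyspace>/<table> that holds the SSTable files.
    """
    # -d takes a comma-separated list of initial hosts in the target cluster.
    return ["sstableloader", "-d", ",".join(contact_points), table_dir]


# Hypothetical values for illustration only.
cmd = build_sstableloader_command(
    ["10.0.0.1", "10.0.0.2"],
    "/staging/my_keyspace/my_table",
)
print(" ".join(cmd))

# To actually run it (not executed here):
# import subprocess
# subprocess.run(cmd, check=True)
```

sstableloader streams each partition to whichever nodes own it in the target cluster, which is why it works where copying files node-to-node does not.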
