We're adding a new datacenter to our Cassandra cluster. Currently we have a 15-node DC with RF=3, which holds about 50TB of data.
We are adding another datacenter in a different country and we want both datacenters to contain all the data. Obviously, synchronising 50TB of data across the internet will take a gargantuan amount of time.
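For a rough sense of scale, here is a back-of-the-envelope estimate of the transfer time. The link speeds and the 80% efficiency factor are assumptions for illustration, not measurements of our actual connection, and the function name is just a placeholder.

```python
def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate days needed to move data_tb (decimal terabytes) over a link.

    link_mbps is the nominal link speed in megabits per second;
    efficiency is an assumed fraction of that speed usable for the copy.
    """
    bits = data_tb * 1e12 * 8                  # terabytes -> bits
    usable_bps = link_mbps * 1e6 * efficiency  # usable bits per second
    return bits / usable_bps / 86400           # seconds -> days


if __name__ == "__main__":
    # Hypothetical link speeds; substitute whatever the real WAN link sustains.
    for mbps in (100, 1000, 10000):
        print(f"{mbps:>6} Mbit/s: {transfer_days(50, mbps):.1f} days")
```

On a sustained 100 Mbit/s this comes out at roughly two months, which is why shipping disks looks attractive.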
Is it possible to copy a full backup to a few disks, ship those to the new DC, and then restore from them there? I'm wondering what the procedure for that would be.
- Could someone give me a few pointers on this operation, if possible at all?
- Or any other tips?
Not sure if it matters, but our new DC is going to be smaller (6 nodes) for the time being, although it will have enough disk space. The new DC is mostly meant as a live backup/failover and will generally not be the primary DC for writes.