azim_91_184236 asked

What is the recommended way for migrating a large on-premise cluster to the cloud?

What is the best way to migrate data in Cassandra clusters from on-premise to the cloud when zero downtime is required and the data size is large (hundreds of TB)? I have looked into a few docs and posts:

1. Expand the OnPrem data center -

https://docs.datastax.com/en/ddac/doc/datastax_enterprise/operations/opsAddDCToCluster.html

2. These posts -

https://community.datastax.com/questions/2524/migrating-application-to-new-cluster.html

http://www.redshots.com/moving-cassandra-clusters-without-downtime-part-1/

The last post (on redshots.com) touches briefly on the challenge of network bandwidth. I would like to understand whether the pattern of adding a new DC (data center) to the cluster and letting the data stream to it via replication is suitable for hundreds of TB of data in a Cassandra cluster.

Is there any other option or pattern for this scenario (zero downtime, large data migration)? For example, where data size and network latency are a concern, could the data be moved by offline methods (backup/restore) onto the nodes in the new DC, with repair/replication then catching up?

I would appreciate any guidance on this.

cassandra migration

1 Answer

Erick Ramirez answered

@azim_91_184236 Adding a new data centre on the cloud is the recommended approach for migrating an on-premise cluster regardless of size. Some of the things you need to consider are:

  • set up a dedicated high-bandwidth network to the cloud like AWS Direct Connect
  • consider temporarily increasing the capacity of the on-premise DC by adding new nodes
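To put the bandwidth point in perspective, here is a back-of-the-envelope estimate of how long the initial streaming could take. It is only a sketch: the 300 TB figure, the link speeds and the 70% sustained utilisation are all assumptions for illustration, and real streaming throughput also depends on the nodes, compaction and throttling settings.

```python
def transfer_days(data_tb: float, link_gbps: float, utilization: float = 0.7) -> float:
    """Estimate days needed to move data_tb terabytes over a link_gbps link.

    utilization is the assumed fraction of the link that streaming can sustain.
    """
    total_bits = data_tb * 1e12 * 8                  # decimal terabytes to bits
    effective_bps = link_gbps * 1e9 * utilization    # usable bits per second
    return total_bits / effective_bps / 86_400       # seconds to days


# Hypothetical cluster size and link speeds, for illustration only.
for gbps in (1, 10, 100):
    print(f"{gbps:>3} Gbps: {transfer_days(300, gbps):.1f} days")
```

Under these assumptions, 300 TB takes roughly 40 days over a 1 Gbps link and about 4 days over 10 Gbps, which is why the dedicated link matters at this scale.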

These 2 things are what you need to do to address the concerns you have. This is not a valid avenue:

"Is there any other option/pattern for this scenario (zero downtime, large data migration) such as, taking the data via offline methods"

You can't just copy the data to a new cluster in the cloud and later try to "join" the two as if they were 2 DCs in one cluster, because they are 2 distinct clusters, each with its own schema version. If you attempt to "join" them, it will cause a schema disagreement and the data will become inaccessible in both clusters. Cheers!
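For reference, once the cloud nodes have joined the existing cluster as a new DC, the add-a-DC approach comes down to two kinds of commands: a CQL `ALTER KEYSPACE` that adds the new DC to the keyspace's replication, and `nodetool rebuild` run on each new node to stream data from the existing DC. The sketch below only builds those command strings. The keyspace name `app_data`, the DC names `onprem` and `cloud`, and the replication factors are hypothetical; the linked DataStax procedure has the full list of steps.

```python
def alter_keyspace_cql(keyspace: str, dc_replication: dict) -> str:
    """Build the CQL that sets per-DC replication factors for a keyspace."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in dc_replication.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


def rebuild_command(source_dc: str) -> str:
    """Build the command run on each new node to stream data from source_dc."""
    return f"nodetool rebuild -- {source_dc}"


# Hypothetical keyspace and DC names, for illustration only.
print(alter_keyspace_cql("app_data", {"onprem": 3, "cloud": 3}))
print(rebuild_command("onprem"))
```

The DC names in the replication map must match the names the cluster's snitch reports, so check them with `nodetool status` before running anything like this.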


azim_91_184236 commented

Thanks @Erick Ramirez for the suggestions and guidance!

Just to clarify on this - "You can't just copy the data to a new cluster on the cloud, then later try to "join" them .." - we do not intend to create a new cluster in the cloud. We intend to create a DC whose nodes contain the same data, with a similar number of nodes and a similar vnodes configuration to OnPrem. So rather than having empty nodes join the cluster, I wanted to see whether nodes that already hold data (backed up and restored from the OnPrem nodes) can join the existing cluster. Thoughts?

Erick Ramirez commented

@azim_91_184236 The token assignments (of vnodes) on the cloud DC will not be identical to the tokens on the on-premise nodes -- they are randomly calculated/allocated. This means that you won't just be able to copy the backup data from on-premise nodes onto the cloud nodes because the data in those SSTables won't necessarily be owned by the nodes you've copied them to. Cheers!
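A small simulation can illustrate the ownership point. This is a simplified model, not Cassandra's actual allocation code: it assumes each node picks its vnode tokens uniformly at random from the Murmur3Partitioner range, that a node owns the range ending at each of its tokens, and that there are 3 nodes with 16 vnodes each (all hypothetical values). It then checks how often a token lands on the "same" node in two independently generated rings.

```python
import bisect
import random

RING_MIN, RING_MAX = -2**63, 2**63 - 1  # Murmur3Partitioner token range


def random_ring(nodes, vnodes, seed):
    """Assign each node `vnodes` random tokens; return a sorted (token, node) list."""
    rng = random.Random(seed)
    return sorted(
        (rng.randint(RING_MIN, RING_MAX), node)
        for node in nodes
        for _ in range(vnodes)
    )


def owner(ring, token):
    """Return the node owning `token`: the first ring token >= token, wrapping around."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token)
    return ring[i % len(ring)][1]


def mismatch_fraction(ring_a, ring_b, samples, seed):
    """Fraction of sampled tokens whose owner differs between the two rings."""
    rng = random.Random(seed)
    sampled = [rng.randint(RING_MIN, RING_MAX) for _ in range(samples)]
    differing = sum(owner(ring_a, t) != owner(ring_b, t) for t in sampled)
    return differing / samples


# Hypothetical layout: node1..node3 on-premise, each paired with a cloud counterpart.
names = ["node1", "node2", "node3"]
onprem = random_ring(names, vnodes=16, seed=1)
cloud = random_ring(names, vnodes=16, seed=2)
print(f"tokens owned by a different node: {mismatch_fraction(onprem, cloud, 10_000, seed=3):.0%}")
```

In this model, most sampled tokens end up owned by a different node in the second ring, so SSTables copied one-to-one from an on-premise node to its cloud counterpart would mostly contain data that the counterpart does not own.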
