
pranali.khanna101994_189965 asked

How does replication work with multiple data centers?

Suppose I have two datacenters: DC1 with 3 nodes and RF=2, and DC2 with 4 nodes and RF=2. My data is distributed across all the nodes in the cluster.

What if DC1 fails or goes down? Since RF=2 applies to DC1, is the data in DC1 not replicated to any node in DC2? If so, how would we recover the data that was in DC1?

Secondly, when I define RF=2 for DC1, will both replicas be placed within the same DC, or can replicas also be placed across DCs?

I mean, if I define RF=2 for DC1, can one replica be in DC2, or will all of them be in DC1 only?

If so, how?

replication

bettina.swynnerton answered

Hi @pranali.khanna101994_189965,

In a multi-datacenter setup, we recommend using a datacenter-aware replication strategy, NetworkTopologyStrategy.

With it, replication is defined explicitly per keyspace, so you know exactly how many replicas each datacenter should hold.

For example:

CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2', 'DC2': '2'} AND durable_writes = true;
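To make the per-datacenter behaviour concrete, here is a rough Python sketch of how replicas could be picked for one partition. It is a toy model, not Cassandra's actual code: the ring, the token values and the node names are made up for illustration, each node has a single token, and racks are ignored.

```python
from bisect import bisect_left

# Hypothetical ring: (token, node, datacenter), sorted by token.
# 3 nodes in DC1 and 4 nodes in DC2, as in the question.
RING = [
    (0, "dc1-n1", "DC1"), (10, "dc2-n1", "DC2"), (20, "dc1-n2", "DC1"),
    (30, "dc2-n2", "DC2"), (40, "dc1-n3", "DC1"), (50, "dc2-n3", "DC2"),
    (60, "dc2-n4", "DC2"),
]

def place_replicas(token, ring, replication):
    """Pick replicas for a partition token, one quota per datacenter.

    Starts at the first node whose token is at or after the partition
    token, then walks the ring clockwise, taking a node only while its
    datacenter still needs replicas. Datacenters that are not listed in
    `replication` never receive a copy.
    """
    tokens = [t for t, _, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    chosen = {dc: [] for dc in replication}
    for i in range(len(ring)):
        _, node, dc = ring[(start + i) % len(ring)]
        if dc in chosen and len(chosen[dc]) < replication[dc]:
            chosen[dc].append(node)
    return chosen

both = place_replicas(25, RING, {"DC1": 2, "DC2": 2})
only_dc1 = place_replicas(25, RING, {"DC1": 2})
print(both)      # two replicas in each datacenter
print(only_dc1)  # two replicas in DC1, nothing in DC2
```

In this model, every partition gets its full quota in each datacenter named in the keyspace definition, regardless of which node the walk starts on.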

If a node is down, the other nodes store hints for it for the duration of the hint window (3 hours by default). If the node is down for longer than that, the data will need to be synced by running a repair.
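As a small illustration of that rule, the decision can be sketched in Python as below. The function name and return strings are invented for this example, and the 3-hour value is just the default mentioned above. It is a simplification; a real cluster may still need repair even after a short outage.

```python
from datetime import timedelta

# Default hint window from the answer above; configurable on a real cluster.
HINT_WINDOW = timedelta(hours=3)

def recovery_action(downtime, hint_window=HINT_WINDOW):
    """Say how a returning node is expected to catch up (simplified)."""
    if downtime <= hint_window:
        # Hints were still being stored for the node the whole time.
        return "hint replay"
    # Writes after the window closed were not hinted, so a repair is needed.
    return "repair"

print(recovery_action(timedelta(hours=1)))
print(recovery_action(timedelta(hours=5)))
```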

For resilience against datacenter outages, define your replication and consistency levels so that a single datacenter can serve reads and writes on its own if the other one fails.
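The arithmetic behind that advice can be sketched as follows. This is only a model for checking the numbers: the helper names are made up, and it counts live replicas per datacenter instead of talking to a cluster. A quorum is taken as floor(RF / 2) + 1.

```python
def quorum(rf):
    """Smallest majority of rf replicas."""
    return rf // 2 + 1

def can_satisfy(cl, replication, live, local_dc):
    """Check whether a consistency level is reachable.

    replication: replicas configured per datacenter, e.g. {"DC1": 2, "DC2": 2}
    live:        replicas currently up per datacenter
    local_dc:    datacenter of the coordinator handling the request
    """
    if cl == "ONE":
        return sum(live.values()) >= 1
    if cl == "LOCAL_ONE":
        return live.get(local_dc, 0) >= 1
    if cl == "LOCAL_QUORUM":
        return live.get(local_dc, 0) >= quorum(replication.get(local_dc, 0))
    if cl == "QUORUM":
        return sum(live.values()) >= quorum(sum(replication.values()))
    if cl == "EACH_QUORUM":
        return all(live.get(dc, 0) >= quorum(rf) for dc, rf in replication.items())
    raise ValueError(f"unsupported consistency level: {cl}")

replication = {"DC1": 2, "DC2": 2}
dc1_down = {"DC1": 0, "DC2": 2}  # every DC1 replica is unreachable

for cl in ("LOCAL_ONE", "LOCAL_QUORUM", "QUORUM", "EACH_QUORUM"):
    print(cl, can_satisfy(cl, replication, dc1_down, local_dc="DC2"))
```

With RF=2 in each datacenter, QUORUM needs 3 of the 4 replicas, so it fails when DC1 is down, while LOCAL_QUORUM and LOCAL_ONE issued in DC2 still succeed. Note that LOCAL_QUORUM with RF=2 needs both local replicas, so it cannot tolerate even one replica node being down in that datacenter.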

More about replication here:

https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/architecture/archDataDistributeReplication.html

I hope this helps!


pranali.khanna101994_189965 commented:

Thanks for the reply! Just a small query: as mentioned, I have two DCs and one of them is down. Say I want to read some data that is stored on one of the nodes in DC1, and suppose its replicas are also in DC1. If DC1 is down, then all the replica nodes for that data are down, so the read will not be able to satisfy any consistency level?

What will happen in this case, given that the nodes in DC2 have no information about the data DC1 held?

Erick Ramirez answered

To add to Bettina's answer and to specifically answer this question:

"what if DC1 fails / or is down as RF=2 for DC1 so data present in DC1 is not replicated to any node in DC2? so how we will recover data present in DC1?"

Data is only available in the DCs where there are replicas. In your case, there is no recovery point if DC1 is down because you've configured your keyspace(s) to only have replicas in DC1. DC2 is irrelevant in this scenario because it has no copy of the data at all.
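A tiny Python sketch of that point, assuming the keyspace's replication settings are given as a plain dict (the function name is made up for this example):

```python
def surviving_replicas(replication, down_dcs):
    """Replicas still reachable, per datacenter, when some DCs are down.

    Only datacenters named in the keyspace's replication settings ever
    hold a copy, so a datacenter missing from `replication` contributes
    nothing here.
    """
    return {
        dc: rf
        for dc, rf in replication.items()
        if rf > 0 and dc not in down_dcs
    }

# Keyspace replicated to DC1 only: losing DC1 leaves no copy anywhere.
print(surviving_replicas({"DC1": 2}, down_dcs={"DC1"}))

# Keyspace replicated to both DCs: DC2 still holds two replicas.
print(surviving_replicas({"DC1": 2, "DC2": 2}, down_dcs={"DC1"}))
```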

"secondly, can it happen that when i define RF=2 for DC1 it will find replica within same DC. can replicas be present across DC also?"

No, it cannot. If you configured your keyspace to NOT have any replicas in DC2, there will be NO replicas in DC2. Cheers!
