sboreddy asked Erick Ramirez commented

Why is my data not replicated to DC2?

Hi,

I am new to Cassandra and am working on a proof of concept (POC) for multi-DC replication.

I have deployed Cassandra on GCP in 2 DCs using the Marketplace (click-to-deploy) solution. Each DC has 2 nodes (DC1: 2 nodes, DC2: 2 nodes).

I modified the seeds in cassandra.yaml, adding 1 node from each DC as a seed node. I then updated the replication settings of the system keyspaces as below:

ALTER KEYSPACE "system_auth" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'us-east1_b': 2, 'us-east1_c': 2};
ALTER KEYSPACE "system_distributed" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'us-east1_b': 2, 'us-east1_c': 2};
ALTER KEYSPACE "system_traces" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'us-east1_b': 2, 'us-east1_c': 2};

I also created one more keyspace for replication:

CREATE KEYSPACE "first" WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'us-east1_b': 2, 'us-east1_c': 2};

The snitch is configured in cassandra.yaml as:

endpoint_snitch: GoogleCloudSnitch

Running nodetool status on node1 of DC1 gives the following output:

poc-cassandra1-db-vm-0:~$ nodetool status
Datacenter: us-east1_b
====================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address     Load       Tokens Owns (effective) Host ID Rack
UN 10.20.40.12 214.18 KiB 256    100.0%      00d096a5-c4a2-4a25-ae0f-e2255afa1839 b
UN 10.20.40.11 232.27 KiB 256    100.0%      a31c790b-f892-4f26-b3dd-c93f2639a19f b

When I create a table and insert a few rows on node1 of DC1, the same data shows up on node2 of DC1, but it is not replicated to the DC2 nodes. Please help me understand what the issue is.

Thanks,

Sandhya

replication

bettina.swynnerton answered Erick Ramirez edited

Hello @sboreddy,

From your nodetool status output we can see that your nodes do not see the whole cluster, only their own DC.

If you are using multiple datacenters, you will also need to configure the cassandra-rackdc.properties file.

See here for the configuration details:

https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/architecture/archSnitchGoogle.html

You will need to restart your nodes after making these configuration changes, then check your nodetool status again to see if your nodes see the whole cluster.
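
As a rough illustration of the naming involved, here is a minimal Python sketch of how the snitch is described as deriving names: the region becomes the datacenter (with dc_suffix appended) and the zone letter becomes the rack. The function name `gce_snitch_names` and the zone strings `us-east1-b` and `us-east1-c` are assumptions for illustration (the zones are inferred from the DC names in the question); this is not the snitch's actual code.

```python
from typing import Tuple


def gce_snitch_names(zone: str, dc_suffix: str = "") -> Tuple[str, str]:
    """Return (datacenter, rack) for a GCE zone such as 'us-east1-b'.

    Assumption: datacenter = region + dc_suffix, rack = last zone component.
    """
    region, sep, rack = zone.rpartition("-")
    if not sep:
        raise ValueError(f"unexpected zone format: {zone!r}")
    return region + dc_suffix, rack


if __name__ == "__main__":
    # Hypothetical zones and suffixes that would yield the DC names in the question.
    print(gce_snitch_names("us-east1-b", "_b"))  # ('us-east1_b', 'b')
    print(gce_snitch_names("us-east1-c", "_c"))  # ('us-east1_c', 'c')
```

Whatever names result, the keyspace replication map has to use the datacenter names exactly as nodetool status reports them.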

I hope this helps!

sboreddy commented ·

Thanks Bettina for your response. I had already modified the cassandra-rackdc.properties file with the dc_suffix parameter as below:

For Zone1 (2 nodes):

dc_suffix=_southcarolina_b

For Zone2 (2 nodes):

dc_suffix=_southcarolina_c

I restarted Cassandra on all nodes, but the nodetool status output is still the same.

Erick Ramirez answered Erick Ramirez commented

I need to reiterate Bettina's main point -- the nodes are not communicating with each other and have not formed a cluster based on the nodetool status output. You need to make sure that you have a valid cluster for replication to work.
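
One way to make that check repeatable is to parse the nodetool status text and list which datacenters a node can actually see. The following is a minimal Python sketch, assuming the output layout shown in the question; the helper name `nodes_by_datacenter` is made up for illustration and is not part of any Cassandra tooling.

```python
import re
from typing import Dict, List


def nodes_by_datacenter(status_output: str) -> Dict[str, List[str]]:
    """Group node addresses from `nodetool status` text by datacenter."""
    datacenters: Dict[str, List[str]] = {}
    current = None
    for line in status_output.splitlines():
        line = line.strip()
        header = re.match(r"Datacenter:\s*(\S+)", line)
        if header:
            current = header.group(1)
            datacenters[current] = []
            continue
        # Node rows start with a status/state code such as UN or DN.
        row = re.match(r"[UD][NLJM]\s+(\S+)", line)
        if row and current is not None:
            datacenters[current].append(row.group(1))
    return datacenters


# Output copied from the question (node1 of DC1).
SAMPLE = """\
Datacenter: us-east1_b
====================================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address     Load       Tokens Owns (effective) Host ID Rack
UN 10.20.40.12 214.18 KiB 256    100.0%      00d096a5-c4a2-4a25-ae0f-e2255afa1839 b
UN 10.20.40.11 232.27 KiB 256    100.0%      a31c790b-f892-4f26-b3dd-c93f2639a19f b
"""

if __name__ == "__main__":
    seen = nodes_by_datacenter(SAMPLE)
    print(seen)
    print("datacenters visible:", len(seen))
```

For a healthy two-DC cluster, the result should contain two datacenters with two addresses each; with the output from the question it contains only one.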

For nodes to form a cluster, they need these three things:

  1. the same cluster name,
  2. at least one common seed, and
  3. network connectivity.

In your case, you need to make sure that all nodes have the same cluster_name in cassandra.yaml. Check that for incorrect spelling or different capitalisation.

We recommend configuring at least one seed from each DC but preferably two seeds from each DC in case one of the seeds is unavailable. I'd also recommend using the same seeds list on all nodes in all DCs.

In your case where there are only two nodes in each DC, you need to specify all four nodes as seeds.
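
The first two conditions can be compared mechanically. Below is a minimal Python sketch that pulls cluster_name and the seeds list out of each node's cassandra.yaml text and checks that they agree. It uses simple regular expressions rather than a YAML parser, so it assumes the stock file layout with no trailing comments on those lines. The helper names, the cluster name `POC Cluster`, and the DC2 addresses `10.20.40.21` and `10.20.40.22` are hypothetical.

```python
import re
from typing import Dict, FrozenSet, Optional


def read_value(yaml_text: str, key: str) -> Optional[str]:
    """Return the value of the first uncommented `key:` line, without quotes."""
    pattern = rf"^[ \t]*(?:-[ \t]*)?{re.escape(key)}:[ \t]*(.+)$"
    match = re.search(pattern, yaml_text, re.MULTILINE)
    return match.group(1).strip().strip("'\"") if match else None


def seed_set(yaml_text: str) -> FrozenSet[str]:
    """Return the seeds as a set, so ordering differences do not matter."""
    raw = read_value(yaml_text, "seeds") or ""
    return frozenset(s.strip() for s in raw.split(",") if s.strip())


def settings_match(configs: Dict[str, str]) -> bool:
    """True if every node has the same cluster_name (case-sensitive) and seeds."""
    names = {read_value(text, "cluster_name") for text in configs.values()}
    seeds = {seed_set(text) for text in configs.values()}
    return len(names) == 1 and len(seeds) == 1


# Hypothetical config excerpt; the last two seed addresses stand in for DC2 nodes.
NODE_A = """\
cluster_name: 'POC Cluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.20.40.11,10.20.40.12,10.20.40.21,10.20.40.22"
"""
# Same file with different capitalisation in the cluster name.
NODE_B = NODE_A.replace("POC Cluster", "POC cluster")

if __name__ == "__main__":
    print(settings_match({"node1": NODE_A, "node2": NODE_A}))  # True
    print(settings_match({"node1": NODE_A, "node2": NODE_B}))  # False
```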

You will also need to confirm that all nodes can communicate with each other across DCs on their private IPs. You need to configure each node with:

listen_address: private_ip
rpc_address: public_ip

Note that if you're running DSE 6+, rpc_address was replaced with native_transport_address (DB-1130). Cheers!
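
For the third condition, network connectivity, a quick test is to open a TCP connection from each node to the gossip port of every other node. This is a minimal Python sketch, assuming the default storage_port of 7000; the function name `can_connect` is made up for illustration.

```python
import socket


def can_connect(host: str, port: int = 7000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, calling `can_connect("10.20.40.11")` from a DC2 node (using the private IP of a DC1 node from the nodetool output) should return True if the firewall rules allow gossip traffic between the DCs.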

sboreddy commented ·

Thanks Erick for your comments.

1. The cluster name is the same on all 4 nodes.

2. Added the 4 internal IPs as seeds on all 4 nodes.

3. Updated rpc_address to 0.0.0.0 and broadcast_rpc_address to the external IP of the corresponding node.

Please note that when I set rpc_address to the external/public IP in cassandra.yaml, nodetool status gave the error "Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused (Connection refused)'".

Even after all these updates, the nodetool status output is still the same (only the 2 nodes in the same zone/rack communicate with each other, not with the 2 nodes in the other zone/rack).

The firewall ports opened are: 7000-7001, 7199

All 4 nodes are in the same subnet. Kindly suggest.


Thanks,

Sandhya


Erick Ramirez ♦♦ sboreddy commented ·

> Please note that when I have updated rpc_address as external/public IP in cassandra.yaml file nodetool status was giving error as "Failed to connect to '127.0.0.1:7199'

That issue is unrelated to rpc_address. JMX port 7199 is bound to localhost unless you override it in cassandra-env.sh. It indicates to me that you have something else configured incorrectly.

The only changes you need to cassandra.yaml are:

  • cluster name
  • seeds list
  • listen address (for gossip on private network)
  • RPC address (for client access on public IP)

The broadcast address is only required for multi-region clusters where there's no connectivity across DCs over the private network. Cheers!
