Bringing together the Apache Cassandra experts from the community and DataStax.

davi_prosesor2008_192135 asked ·

What is the best practice for multi-datacenter clusters in Cassandra?

Hi everyone, I am new to Cassandra. Right now I am coding a small application that connects to Cassandra, but I need to know how to configure and set up a Cassandra cluster with 6 nodes across two data centers: 3 in the DC and the other 3 in the DRC. I already followed the DataStax instructions, but sometimes I didn't receive a response from Cassandra when I used RF 5, and I got consistency problems when I used RF 3. Could someone help me with the best configuration?

cassandra

Erick Ramirez answered ·

I don't quite understand your question so I'm going to try my best in providing an answer.

Multi-DC clusters are commonplace in lots of organisations around the world. You need to configure your keyspaces to use NetworkTopologyStrategy, and we recommend you configure 3 replicas in each DC. For example:

CREATE KEYSPACE community WITH REPLICATION = {
  'class' : 'NetworkTopologyStrategy',
  'datacenter1' : 3,
  'datacenter2' : 3
};

We recommend a strong consistency level of LOCAL_QUORUM for both reads and writes. With 3 replicas in each DC, your application can tolerate a node outage with LOCAL_QUORUM queries.
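
To make the arithmetic behind that recommendation concrete, here is a minimal sketch of the quorum calculation. The function names are hypothetical and only illustrate the formula: a quorum is a majority of the replicas, and LOCAL_QUORUM applies it to the replication factor of the local DC only.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM-style consistency level."""
    return replication_factor // 2 + 1


def tolerated_outages(replication_factor: int) -> int:
    """Replicas that can be down while quorum requests still succeed."""
    return replication_factor - quorum(replication_factor)


# LOCAL_QUORUM only counts replicas in the local DC, so pass the per-DC RF.
for rf in (3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated outages={tolerated_outages(rf)}")
```

With 3 replicas in a DC, 2 must respond, so one replica in that DC can be down. With an RF of 5, 3 must respond, so a quorum request needs at least 3 replicas to be reachable.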

If you update your original question with details of the "consistency problem" you experienced, I'd be happy to address it.

You can find out more information in the following documents:

Cheers!


Hi Erick, thank you for the answer and advice. Yesterday I used SimpleStrategy; from your answer, I see I need to change to NetworkTopologyStrategy. I will do that and see what the impact is. By the way, I have some questions related to your answer:
1. In clustering, what is the best value to set for the timeout?

2. Where should I set the LOCAL_QUORUM consistency: in the client, which is my app, or on the database side?

Thank you

Gangadhara M.B answered ·

Hi Davi,

1) For a multi-DC setup, NetworkTopologyStrategy is always recommended, including on premises. You also need to choose an appropriate snitch for a multi-DC cluster. For example, on AWS EC2 set endpoint_snitch: Ec2Snitch when all nodes are in a single region, or Ec2MultiRegionSnitch when the DCs span regions.

2) If you are changing the replication strategy, you need to run repair on those keyspaces so that every replica receives the data it is now responsible for.

3) The defaults are write_request_timeout_in_ms: 2000 and counter_write_request_timeout_in_ms: 5000. Try tuning whatever you can on the application side, networking, I/O, CPU, etc. rather than directly tweaking DB-side parameters in cassandra.yaml. If you really want to change parameters on the Cassandra side, change one parameter at a time, observe the change in results, and only then look at another parameter.

In some use cases, where you have high-end servers with more CPUs, more memory, a high-I/O-throughput storage subsystem, high network speed between the nodes, etc., it makes sense to consider changing parameters and measuring the resulting performance improvement.

4) The consistency level is set at the application level, not at the Cassandra level. If no consistency level is set anywhere, the driver's default applies (ONE or LOCAL_ONE, depending on the driver and version).

5) As Erick mentioned, you need to upload more log content so we can see the actual reason for the timeout. Erick is the master; he can answer further after seeing the detailed log.



Hi Gangadhara,

When I changed the replication strategy from SimpleStrategy to NetworkTopologyStrategy and ran repair, I didn't receive a response from the server and the session timed out. I have attached the system.log here. I also get a timeout when I do a SELECT COUNT on one table.


system.zip (389.4 KiB)
smadhavan answered ·

@davi_prosesor2008_192135, please also refer to this documentation about initializing the cluster. Read and write consistency levels are set on the client side, i.e. in your application. If you could elaborate on what you mean by “clustering timeout” by updating your post, I could get back to you with some pointers. Thanks!
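
As an illustration of what "set on the client side" can look like, here is a minimal sketch. It assumes the DataStax Python driver (cassandra-driver), whereas the application in this thread uses the Java driver, so treat it as the same idea in a different client rather than the exact code to use. The function name and its parameters are hypothetical. The consistency level and the client-side request timeout both live in the driver's execution profile, not in cassandra.yaml.

```python
def run_local_quorum_query(contact_points, local_dc, keyspace, cql, timeout_s=10.0):
    """Run one statement at LOCAL_QUORUM with a client-side timeout."""
    # Imported inside the function so this file still loads on machines
    # where cassandra-driver (pip install cassandra-driver) is missing.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import DCAwareRoundRobinPolicy

    profile = ExecutionProfile(
        # Route requests to nodes in the application's own DC.
        load_balancing_policy=DCAwareRoundRobinPolicy(local_dc=local_dc),
        # Applied to every statement that does not override it.
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
        # Seconds the client waits for a response before giving up.
        request_timeout=timeout_s,
    )
    cluster = Cluster(contact_points, execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    try:
        session = cluster.connect(keyspace)
        return list(session.execute(cql))
    finally:
        cluster.shutdown()
```

A call might look like run_local_quorum_query(["10.0.0.1"], "datacenter1", "community", "SELECT release_version FROM system.local"), where the contact point is a placeholder and the DC and keyspace names are taken from the example keyspace earlier in the thread. In the Java driver 3.x, the corresponding settings are QueryOptions.setConsistencyLevel and SocketOptions.setReadTimeoutMillis.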


Thank you for the information. I meant that when I connect to the Cassandra cluster I always get a timeout error, so maybe there is a timeout setting that needs to be configured. Should I configure it, and what is the best value for it?

This is the error:

com.datastax.driver.core.exceptions.OperationTimedOutException: [/119.114.193.29:9042] Timed out waiting for server response
    at com.datastax.driver.core.exceptions.OperationTimedOutException.copy(OperationTimedOutException.java:43)
    at com.datastax.driver.core.exceptions.OperationTimedOutException.copy(OperationTimedOutException.java:25)
    at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:35)

smadhavan replied to davi_prosesor2008_192135 ·

@davi_prosesor2008_192135, I don't think that's the full stack trace of the error. OperationTimedOutException is thrown when the driver (on the client side) doesn't get a response from the nodes. Were you able to check the system.log around the same time to see what was happening in the cluster? It could very well be that you have an overloaded cluster that is unable to process requests. Although a bit old, this is still a great blog which covers how error handling is done.
