
tshende asked ·

Can we have 5 nodes in 5 racks with RF=3?

I have an existing production application deployed across 5 racks; it is sized to tolerate the failure of two racks. We are now adding Cassandra as a new technology in our application.

I am thinking of going with a 5-node Cassandra cluster with one node in each rack; all nodes will be active and part of the cluster. The replication factor is 3 in our case, and we have sized the nodes so that we can survive two node/rack failures.

The question is: which is the better topology for this requirement?

1. 5 nodes on 5 racks

2. 6 nodes on 3 racks

What design considerations should I take into account?

topology

1 Answer

Erick Ramirez answered ·

As a general rule, when you want to place nodes in Cassandra racks, we recommend having the same number of racks as the replication factor. We also recommend having the same number of nodes in each rack so that data distribution is balanced. Otherwise, a single-rack configuration is sufficient for most environments.

The recommendation for production environments is to have 3 replicas in each DC, so you need 3 racks. If this is not possible, you should fall back to a single rack for the whole DC.

So, specifically on your question of "5 nodes in 5 racks" versus "6 nodes in 3 racks", the latter is preferred since it satisfies both recommendations: (1) the number of racks equals the replication factor, and (2) each rack has the same number of nodes. Cheers!
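To make the "racks equal to the replication factor" recommendation a bit more concrete, here is a rough Python sketch of rack-aware replica placement. It is only a simplified model, not Cassandra's actual code: it assumes one token per node, at least as many racks as replicas, and a ring where racks alternate. The node names, rack names, and the `place_replicas` helper are all hypothetical.

```python
def place_replicas(ring, rf):
    """Simplified model of rack-aware placement (NOT Cassandra's real code).

    ring: list of (node, rack) tuples in token order, one token per node.
    Starting at each node, walk the ring clockwise and pick the next node
    whose rack has not been used yet, until rf replicas are chosen.
    Assumes the ring has at least rf distinct racks.
    """
    placement = {}
    n = len(ring)
    for start in range(n):
        chosen, used_racks = [], set()
        for step in range(n):
            node, rack = ring[(start + step) % n]
            if rack not in used_racks:
                chosen.append(node)
                used_racks.add(rack)
            if len(chosen) == rf:
                break
        placement[ring[start][0]] = chosen
    return placement


# Hypothetical layouts for the two options in the question.
six_on_three = [("n1", "r1"), ("n2", "r2"), ("n3", "r3"),
                ("n4", "r1"), ("n5", "r2"), ("n6", "r3")]
five_on_five = [("n1", "r1"), ("n2", "r2"), ("n3", "r3"),
                ("n4", "r4"), ("n5", "r5")]

for name, ring in [("6 nodes / 3 racks", six_on_three),
                   ("5 nodes / 5 racks", five_on_five)]:
    rack_of = dict(ring)
    print(name)
    for primary, replicas in place_replicas(ring, 3).items():
        racks = sorted(rack_of[r] for r in replicas)
        print(f"  range owned by {primary}: replicas on {replicas}, racks {racks}")
```

Under these assumptions, the 3-rack layout puts one copy of every range in every rack, so each rack holds a full copy of the data. In the 5-rack layout each range lands on only 3 of the 5 racks, and which 3 depends on the range.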


@Erick Ramirez thanks for the quick response.

Can you please share the rationale behind the "same number of racks as the replication factor" recommendation? Any articles/links on this would be helpful.


Here are a few advantages of "5 nodes on 5 racks":

1. Need one server less (Cost efficient)

2. All application racks are of same capacity

3. If one rack goes down, I lose only 20% of capacity; with two racks down, I lose 40%. (In the 6-node, 3-rack design, I lose 33% if one rack goes down and 66% if two racks go down.)


"Here are a few advantages of '5 nodes on 5 racks'"

There isn't a nice way of distributing 3 copies of data (RF=3) among 5 racks. This means that there won't be an equal distribution of data load across the 5 nodes.

"Need one server less (Cost efficient)"

You choose Cassandra because you have a scale problem and want high availability. It isn't a question of cost.

"All application racks are of same capacity"

This isn't relevant.

"If one rack goes down, I am losing only 20% capacity; two racks down I am losing 40% capacity."

This is incorrect. With RF=3, you have 3 replicas. If you lose any node/replica, you lose a third -- 1 of 3 replicas. Your idea of racks doesn't reflect how Cassandra distributes copies of data across replicas. Cheers!
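To put numbers on "1 of 3 replicas", here is a small Python sketch of the replica arithmetic. It assumes the application reads and writes at either ONE or QUORUM, which isn't stated in this thread, and the helper names `quorum` and `satisfiable` are made up for illustration. The quorum size itself is the usual floor(RF / 2) + 1.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1


def satisfiable(rf: int, replicas_down: int, required: int) -> bool:
    """True if enough replicas of a partition are still up."""
    return rf - replicas_down >= required


RF = 3  # replication factor from the question

for down in range(RF + 1):
    alive = RF - down
    one_ok = satisfiable(RF, down, 1)              # assumed consistency level ONE
    quorum_ok = satisfiable(RF, down, quorum(RF))  # assumed consistency level QUORUM
    print(f"{down} replica(s) down, {alive} alive: "
          f"ONE {'ok' if one_ok else 'fails'}, "
          f"QUORUM {'ok' if quorum_ok else 'fails'}")
```

So with RF=3, a partition that loses one replica can still be served at QUORUM, but one that loses two replicas can only be served at ONE. Whether "two racks down" is survivable therefore depends on how many replicas of a given partition sit in those racks and on the consistency level you use, not on the percentage of nodes lost.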
