Gangadhara M.B asked ·

Can we set a different num_tokens value for new nodes that will eventually replace old nodes?

Hi Team,

We have an existing 6-node DSE 5.1.11 cluster running on AWS EC2 c5.4xlarge instances. The business is now asking us to add 9 new AWS EC2 i3.2xlarge nodes to the existing cluster and decommission all of the old c5.4xlarge nodes.

My questions are:

1) The cluster currently runs with "num_tokens: 256" on all of the c5.4xlarge nodes. When I add the new i3.2xlarge nodes to the cluster, can I set "num_tokens: 8" on the i3.2xlarge nodes only?

2) Can the cluster run for a short time with different sets of nodes using different "num_tokens" values? After a few days we plan to decommission all of the old c5.4xlarge nodes.

Thanks

Gangadhara

virtual nodes
Erick Ramirez answered ·

Yes, it is absolutely fine for nodes in a cluster to have differing numbers of virtual nodes. The typical use case for allocating different num_tokens is when you have non-identical hardware for nodes in the same DC.

An example scenario is where existing nodes have 8 cores + 64GB RAM with num_tokens: 8. New nodes with more powerful hardware -- 16 cores + 128GB RAM -- can take on twice the load of existing nodes. In this case, it makes sense to allocate twice the number of tokens, so set num_tokens: 16.
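As a rough illustration of that proportional reasoning, here is a minimal Python sketch. The helper name `scaled_num_tokens` and the use of core count as the capacity measure are assumptions for the example only, not a DataStax formula:

```python
def scaled_num_tokens(base_tokens: int, base_capacity: float, new_capacity: float) -> int:
    """Scale num_tokens in proportion to relative node capacity (never below 1)."""
    return max(1, round(base_tokens * new_capacity / base_capacity))


# Existing nodes: 8 cores with num_tokens: 8. New nodes: 16 cores.
print(scaled_num_tokens(base_tokens=8, base_capacity=8, new_capacity=16))  # 16
```

The result is only a starting point; the value still has to be set as num_tokens in cassandra.yaml on each new node before it joins the cluster.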

I've discussed this in a bit more detail in this post -- Deploying nodes in a cluster with different hardware configuration.

To respond to your questions directly:

  1. Yes, we recommend 8 tokens when deploying clusters with virtual nodes. Both 8 and 16 are good choices, and 256 is no longer recommended.
  2. Yes, it is not a problem at all particularly since it is just going to be temporary.

On a side note, I'm a huge fan of i3.2xlarge instances (8 cores + 61GB RAM + 1.9TB NVMe SSD). You will find that they will perform significantly better than the c5.4xlarge instances (16 cores + 31GB RAM + EBS only) they're replacing. Nodes are more likely to be IO-bound than CPU-bound so the NVMe SSDs on i3 instances are going to provide a much higher throughput. Cheers!

jclu3_36507 answered ·

Hi Gangadhara:

You probably don't want to add 9 i3 nodes, each with num_tokens of 8, to an existing DC of 6 c5 nodes, each with num_tokens of 256. If you did, you would end up with a DC of 15 nodes and a total of 1608 tokens (9 x 8 + 6 x 256) that is lopsided in terms of load: the 9 i3 nodes would own only 72 of those tokens (about 4.5%), so they would handle far less than half of the load. RF makes this calculation a bit more difficult, but the resulting DC will not be balanced.
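The token arithmetic here can be checked with a few lines of Python. This is only a sketch: the function name `token_share` is made up for the example, and a group's share of the token count is just an approximation of the data it ends up owning, since token placement and replication also play a part.

```python
def token_share(groups):
    """groups maps a label to (node_count, num_tokens).

    Returns (label -> fraction of all tokens, total token count).
    """
    totals = {name: count * tokens for name, (count, tokens) in groups.items()}
    all_tokens = sum(totals.values())
    shares = {name: owned / all_tokens for name, owned in totals.items()}
    return shares, all_tokens


shares, total = token_share({"c5": (6, 256), "i3": (9, 8)})
print(total)                         # 1608
print(round(shares["i3"] * 100, 1))  # 4.5
```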

A better idea would be to build another DC from the i3 nodes in the same cluster as the c5 nodes, then rebuild that i3 DC with data from the c5 DC. This way data distribution is even on both DCs and load handling remains undisturbed. When you are satisfied with the data distribution and accuracy, change the driver configuration on your app servers to point to the i3 DC.
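To make the two key steps concrete, the sketch below only builds the command strings; it does not run anything. The keyspace name `my_ks` and the DC names `dc_c5` and `dc_i3` are hypothetical placeholders, and it assumes the keyspace uses NetworkTopologyStrategy with the same RF in both DCs. The ALTER KEYSPACE statement is run once through cqlsh, and `nodetool rebuild` is run on each node in the new DC, naming the existing DC as the source.

```python
def migration_commands(keyspace, old_dc, new_dc, rf):
    """Return the CQL and nodetool commands for replicating into a new DC."""
    # Step 1 (cqlsh, once): replicate the keyspace to both DCs.
    alter = (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', '{old_dc}': {rf}, '{new_dc}': {rf}}};"
    )
    # Step 2 (shell, on every node in the new DC): stream data from the old DC.
    rebuild = f"nodetool rebuild -- {old_dc}"
    return [alter, rebuild]


# Hypothetical names, for illustration only.
for command in migration_commands("my_ks", "dc_c5", "dc_i3", 3):
    print(command)
```

The remaining steps (adding the i3 nodes as a new DC, repointing the drivers, then removing the old DC from replication and decommissioning its nodes) are left out of the sketch.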


Hope this helps.


Got it, thanks a lot for your update.


Thanks for being a part of the community. We value everyone's contribution.

I do want to point out that the number of tokens a node owns doesn't have a bearing on the replication factor. Ultimately, there isn't an issue with heterogeneous virtual node configuration provided it is done correctly. Cheers!

jclu3_36507 replied to Erick Ramirez ·

Hi Erick: Thanks for writing back. A quick follow-up: if you have a 3-node cluster where one node (say node A) has num_tokens of 256 and the other two (say B and C) have num_tokens of 8, how does an RF of 3 work? How many replicas will nodes B and C hold, in terms of token ranges?

Thanks.


I think you are confusing replication factor with tokens.

num_tokens determines how many virtual nodes exist on a node. Each of those virtual nodes is a replica. Cheers!
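For the 3-node example in the question (A with 256 tokens, B and C with 8 each, RF of 3), a toy ring model shows the difference. This is a simplified sketch, not Cassandra's actual code: it assumes random tokens in the Murmur3 range and a SimpleStrategy-style walk that goes clockwise from the key's token and collects the first RF distinct nodes. The names `build_ring` and `replicas_for` are made up for the example.

```python
import random
from bisect import bisect_left

TOKEN_MIN, TOKEN_MAX = -2**63, 2**63 - 1  # Murmur3 token range


def build_ring(num_tokens_by_node, seed=0):
    """Assign each node num_tokens random tokens; return sorted (token, node) pairs."""
    rng = random.Random(seed)
    ring = []
    for node, count in num_tokens_by_node.items():
        ring.extend((rng.randint(TOKEN_MIN, TOKEN_MAX), node) for _ in range(count))
    return sorted(ring)


def replicas_for(ring, key_token, rf):
    """Walk clockwise from the key's token and collect the first rf distinct nodes."""
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)  # first vnode token >= key
    owners = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in owners:
            owners.append(node)
            if len(owners) == rf:
                break
    return owners


ring = build_ring({"A": 256, "B": 8, "C": 8})
rng = random.Random(1)
primary_counts = {"A": 0, "B": 0, "C": 0}
replica_sets = set()
for _ in range(10_000):
    owners = replicas_for(ring, rng.randint(TOKEN_MIN, TOKEN_MAX), rf=3)
    primary_counts[owners[0]] += 1       # first owner is the primary replica
    replica_sets.add(frozenset(owners))  # full set of nodes holding the key

print(replica_sets)    # a single set containing A, B and C
print(primary_counts)  # A is the primary for most keys
```

In this model every key is stored on all three nodes, because RF equals the node count. num_tokens only changes how the primary ranges are split between the nodes, which is why A is the primary for most keys.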
