
penky28_147901 asked · penky28_147901 answered

What are the steps for reducing vnodes in a production cluster?

Hi All,

We are currently running Apache Cassandra version 3.11.3 and are planning to upgrade to 4.0.

I see that Cassandra 4.0 recommends using a lower number of vnodes. Can anyone please tell me the high-level steps for reducing the virtual nodes on a production cluster?

On 3.11.3, num_tokens = 256.

We propose to use num_tokens = 8 on version 4.0.

One way I can think of is creating a new datacenter with the new vnode setting, migrating the data to it, and decommissioning the existing datacenter once the migration is complete. However, this requires extra hardware and therefore additional cost.

Can you suggest any other approach for this activity? Downtime is acceptable.

It is a 5-node cluster, each node holding ~200 GB, and only one datacenter is in use.

-Raghav

virtual nodes

Erick Ramirez answered · Erick Ramirez commented

It isn't possible to change the number of virtual nodes (num_tokens) once a node is part of a cluster; it can only be set for new nodes.
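For reference, num_tokens lives in cassandra.yaml and is only honoured the first time a node bootstraps. A minimal pre-start check on a brand-new node might look like this (the /etc/cassandra path assumes a package install; adjust for your layout):

    # confirm the token count BEFORE the node's first start
    grep -E '^num_tokens' /etc/cassandra/cassandra.yaml
    # expected for the proposed configuration:
    # num_tokens: 8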

There are two possible options to implement this:

  1. Add a new DC with the new vnode configuration, then decommission the old DC.
  2. Add new nodes with 8 tokens to the existing DC and decommission the old nodes.

For option 1, the procedure is the same as switching from single-token nodes, as described in Enabling virtual nodes on an existing cluster.
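As a rough command-level sketch of that approach (the DC names DC1/DC2, the keyspace my_ks and the replication factor of 3 are placeholders, not taken from the post): bring up the new nodes with num_tokens: 8 and the new DC name (e.g. in cassandra-rackdc.properties) before they first start, then:

    # include the new DC in replication for every application keyspace
    cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};"
    # on each node in the new DC, stream the existing data from the old DC
    nodetool rebuild -- DC1
    # once clients connect only to DC2, drop DC1 from replication ...
    cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'DC2': 3};"
    # ... and decommission each node in the old DC
    nodetool decommission

Remember that system_auth and the other replicated system keyspaces need the same replication change, and run the rebuilds a node or two at a time if streaming load is a concern.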

For option 2, you could implement it whichever way suits you. For example (a command-level sketch follows the list):

  1. Install/configure C* on a new server with num_tokens: 8.
  2. Add the node to the DC.
  3. Decommission one of the existing nodes in the DC.
  4. Completely wipe the contents of the data/, commitlog/ and saved_caches/ subdirectories.
  5. Reconfigure the node with num_tokens: 8.
  6. Add the node back to the DC.
  7. Repeat steps 3 to 6 until all nodes have been reconfigured.
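A minimal sketch of steps 3 to 6 on one node, assuming a package install with the default /var/lib/cassandra directories and systemd (adjust paths and the service name to your environment):

    # step 3: retire the node (run on the node being replaced)
    nodetool decommission
    # step 4: wipe its old state so it can bootstrap cleanly
    sudo rm -rf /var/lib/cassandra/data/* \
                /var/lib/cassandra/commitlog/* \
                /var/lib/cassandra/saved_caches/*
    # step 5: set num_tokens: 8 in cassandra.yaml before restarting
    # step 6: start the service and let the node bootstrap back in
    sudo systemctl start cassandra
    nodetool status    # wait for the node to show UN before moving to the next one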

WARNING: When adding a seed node back into the cluster, make sure it doesn't have its own IP in its seeds list, or it will not bootstrap any data when it joins the cluster. Cheers!
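A quick way to sanity-check that on the node being re-added (path and addresses are only illustrative):

    # this node's own IP must not appear in the list while it bootstraps
    grep 'seeds:' /etc/cassandra/cassandra.yaml
    #   - seeds: "10.0.0.1,10.0.0.2"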

2 comments

Thanks, Erick.

I was wondering if we can follow the steps below:

1. Execute nodetool snapshot on each node.

2. Copy the snapshots to a different drive (/backup) on each node.

3. Remove the data and start Cassandra with 8 vnodes (num_tokens: 8).

4. Copy the data from the /backup drive back onto each node.

5. Start Cassandra.

Let me know if the above approach would work.

-Raghav


Nope, or I would've given it as an option. :)

The data in the SSTables is only valid for the token ranges the node owned at the time it was written. Once you change the token assignments, the data in those SSTables is no longer readable by the node, resulting in data loss. Cheers!

penky28_147901 answered

Thanks, Erick.
