question

rashokkumartce_193569 asked ·

How is data split across nodes in DC when a new node joins the cluster?

When a new node joins the cluster:

1. Will the token ranges for each node change? For example, if node A is responsible for partition token range 1-10, will it be responsible for only tokens 1-8 after the new node joins?

2. If so, won't it be an expensive operation, since data will have to be moved between nodes? Will there be downtime as a result?

cassandra

1 Answer

Erick Ramirez answered ·

The token assigned to a node determines the token range (and its underlying data) that the node owns. It also determines the node's position around the ring (data centre).

Consider this example 3-node cluster where the yellow node at the top owns the token range shown as yellow:

When a new light-blue node is added whose token bisects the yellow token range, the light-blue node takes ownership of half the data that the yellow node owned (section of the ring shown in light-blue):
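The same idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how Cassandra is implemented: the token space is shrunk to 0-99, each node has a single token (no vnodes), replication is ignored, and the node names, token values and the `owner` helper are all hypothetical. A node owns the range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around the ring.

```python
from bisect import bisect_left

def owner(ring, key_token):
    """Return the name of the node that owns key_token.

    ring maps node name -> token. A node owns the range
    (previous node's token, its own token], wrapping around the ring.
    """
    tokens = sorted((tok, name) for name, tok in ring.items())
    # First node whose token is >= key_token; wrap to the lowest token if none.
    idx = bisect_left([tok for tok, _ in tokens], key_token)
    return tokens[idx % len(tokens)][1]

# Hypothetical 3-node ring: yellow owns (90, 30], i.e. 91-99 and 0-30.
ring = {"yellow": 30, "green": 60, "blue": 90}
keys = list(range(0, 100, 5))  # stand-in partition tokens

before = {k: owner(ring, k) for k in keys}

# The new node's token (10) bisects yellow's range.
ring_after = dict(ring, light_blue=10)
after = {k: owner(ring_after, k) for k in keys}

# Only keys whose owner changed would need to be streamed to the new node.
moved = {k: (before[k], after[k]) for k in keys if before[k] != after[k]}
for k in sorted(moved):
    print(k, moved[k])
```

In this sketch only tokens that previously belonged to yellow change owner, and only about half of them; the green and blue ranges are untouched.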

To answer your questions explicitly:

  1. Yes, the token you assign to the new node will change the token range of the adjacent node in the ring.
  2. Yes, portions of the data that another node used to own will be streamed to the new node when it bootstraps. But no, there is no downtime. Cassandra is an always-on database. No operation in C* requires downtime, not even upgrades.

There are additional details in "How data is distributed across a cluster".

If you haven't done it already, I recommend the DS201 Cassandra Foundations course at DataStax Academy, which explains these concepts in detail. Cheers!

