pranali.khanna101994_189965 asked Erick Ramirez edited

What is the default num_tokens (vnodes) in Cassandra?

Hi,

While going through DS201, I found a quiz question about the default vnode count (num_tokens), and the quiz accepts 128 as the correct answer. I thought the default had changed to 256. Is that right?

cassandra, ds201, configuration, virtual nodes

Erick Ramirez answered satvantsingh_190085 commented

The "default" number of tokens depends on which aspect of the configuration you're looking at and is something I'll try to clarify.

Apache Cassandra

In open-source Cassandra, num_tokens is explicitly set to 256 in cassandra.yaml. This is true for both the latest C* 3.11.6 release and the current alpha release C* 4.0-alpha4:

num_tokens: 256

However, if num_tokens is not set in cassandra.yaml, the number of tokens defaults to just 1 (see Config.java):

    public int num_tokens = 1;

DataStax Enterprise

The default cassandra.yaml that ships with DSE (for example, DSE 6.0) has num_tokens set to 128, but the line is commented out, so it has no effect during bootstrap:

# num_tokens: 128

Unless num_tokens is explicitly set, a new node will bootstrap with just 1 token.
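
As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not how Cassandra or DSE actually loads its configuration: `effective_num_tokens` and `CODE_DEFAULT_NUM_TOKENS` are hypothetical names, and the line-based scan is a simplification of real YAML parsing. It only shows that a commented-out `num_tokens` line is ignored, so the value falls back to the 1 from Config.java.

```python
import re

# Fallback mirroring the Config.java line quoted above: public int num_tokens = 1;
CODE_DEFAULT_NUM_TOKENS = 1


def effective_num_tokens(yaml_text: str) -> int:
    """Return the num_tokens a node would use for the given cassandra.yaml text.

    Simplified assumption: only an uncommented, top-level "num_tokens: N"
    line counts as a setting; anything else falls back to the code default.
    """
    for line in yaml_text.splitlines():
        match = re.match(r"^num_tokens:\s*(\d+)\s*(#.*)?$", line)
        if match:
            return int(match.group(1))
    return CODE_DEFAULT_NUM_TOKENS


print(effective_num_tokens("num_tokens: 256\n"))    # open-source cassandra.yaml
print(effective_num_tokens("# num_tokens: 128\n"))  # DSE cassandra.yaml, commented out
```

The first call prints 256 and the second prints 1, matching the two cases above.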

DS201 course

Specifically on your query about the Academy exercise on vnodes, the default value hasn't changed.

The quiz was asking about the default in the DSE 6.0.0 cassandra.yaml, the version included in the VM image you downloaded for the course. In that file, num_tokens is commented out with a value of 128:

# num_tokens: 128

For the purposes of the DS201 course, 128 tokens was the appropriate answer. I understand the confusion, though: if you install DSE without explicitly setting num_tokens, the node will bootstrap with just 1 token, because the default in DSE's Config.java is also 1 token:

    public int num_tokens = 1;

Other info

Note that there are discussions in the C* devs' mailing list and CASSANDRA-15521 to update the defaults in the near future. Cheers!


saravanan.chinnachamy_185977 answered saravanan.chinnachamy_185977 edited

@pranali.khanna101994_189965 Congratulations on your effort to complete the DS201 course on DataStax Academy. I appreciate that you noticed this and asked a good question.

As far as the recommended number of vnodes goes, it continues to evolve as we strive to improve the performance of Cassandra clusters. The DataStax recommendation has changed since we made the course.

When the initial work on vnodes in Cassandra was started, 256 (or 128) tokens were chosen to ensure an even distribution of data. Unfortunately, choosing a number this high had several unforeseen (and unpredictable) consequences. A high vnode count caused issues with bootstrapping new nodes (lots of SSTables), longer repair times, and overall higher CPU usage.
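
To make the "even distribution" point concrete, here is a small Python simulation. It is only a sketch under stated assumptions: tokens are assigned purely at random (no allocation algorithm), the ring is the 2**64-wide Murmur3 token range shifted to start at 0, and the cluster size of 6 nodes is arbitrary. The function names are hypothetical.

```python
import random

RING_SIZE = 2**64  # width of the Murmur3 token range, shifted to start at 0


def ownership_imbalance(num_nodes, num_tokens, rng):
    """Largest node's share of the ring divided by the ideal share (1.0 is perfect)."""
    tokens = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    # Each token owns the range from the previous token up to itself;
    # the first token's range wraps around from the last token on the ring.
    prev = tokens[-1][0] - RING_SIZE
    for token, node in tokens:
        owned[node] += token - prev
        prev = token
    return max(owned) / (RING_SIZE / num_nodes)


def mean_imbalance(num_nodes, num_tokens, trials=20, seed=42):
    """Average the imbalance over several randomly generated rings."""
    rng = random.Random(seed)
    total = sum(ownership_imbalance(num_nodes, num_tokens, rng) for _ in range(trials))
    return total / trials


for num_tokens in (1, 8, 256):
    print(num_tokens, round(mean_imbalance(6, num_tokens), 2))
```

With random assignment, the imbalance shrinks as the token count grows, which is why a high value was attractive originally. The simulation does not model the operational costs listed above, nor a smarter allocation of tokens.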

Currently, a new token allocation algorithm is used, which balances the workload efficiently with fewer tokens. The allocation algorithm attempts to choose tokens in a way that optimizes replicated load over the nodes in the datacenter for the specified RF.

DataStax recommends using 8 vnodes (tokens).

  • When adding a node to an existing cluster or setting up nodes in a new datacenter, set the target replication factor (RF) of keyspaces in the datacenter with the allocate_tokens_for_local_replication_factor option.
  • The allocation algorithm distributes the token ranges proportionately using the num_tokens settings.
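
Put together, the two settings might look like the following cassandra.yaml fragment. This is only a sketch: the RF value of 3 is an assumption for illustration, and it should match the replication factor your keyspaces actually use in that datacenter.

```yaml
# Recommended token count from this answer
num_tokens: 8

# Target RF of the keyspaces in this datacenter (3 is an assumed example value)
allocate_tokens_for_local_replication_factor: 3
```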

Please refer to the following documentation for more details.

Virtual node recommendation

New token allocation algorithm in Cassandra 3.0

