Hi,
While going through DSE201, I found a quiz question asking for the default vnode size, and the quiz accepts 128 as the correct answer. But I think the default has changed to 256, right?
The "default" number of tokens depends on which aspect of the configuration you're looking at and is something I'll try to clarify.
In open-source Cassandra, num_tokens is explicitly set to 256 in cassandra.yaml. This is true for both the latest C* 3.11.6 release and the current alpha release C* 4.0-alpha4:
num_tokens: 256
However, if num_tokens is not set in cassandra.yaml, the number of tokens defaults to just 1 (see Config.java):
public int num_tokens = 1;
The default cassandra.yaml that ships with DSE (for example, DSE 6.0) has num_tokens set to 128, but the line is commented out, so it has no effect during bootstrap:
# num_tokens: 128
Unless num_tokens is explicitly set, a new node will bootstrap with just 1 token.
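To make the difference concrete, here is a minimal sketch of the relevant cassandra.yaml excerpt. The value 128 simply mirrors the commented-out line that ships with DSE 6.0; it is used for illustration and is not a recommendation. The setting should be in place before the node starts for the first time.

```yaml
# cassandra.yaml (excerpt, illustrative only)

# As shipped with DSE 6.0: the line is commented out, so it is ignored
# and the node falls back to the Config.java default of 1 token.
# num_tokens: 128

# Uncommented: the node bootstraps with the number of tokens given here.
num_tokens: 128
```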
Specifically on your query about the Academy exercise on vnodes, the default value hasn't changed.
The quiz was asking about the default in the DSE 6.0.0 cassandra.yaml (the version included in the VM image you downloaded for the course), where the setting is commented out with a value of 128:
# num_tokens: 128
For the purposes of the DS201 course, 128 tokens was the appropriate answer. But I understand the confusion: if you install DSE without explicitly setting num_tokens, the node will bootstrap with just 1 token, since the default in DSE's Config.java is also 1 token:
public int num_tokens = 1;
Note that there are discussions in the C* devs' mailing list and CASSANDRA-15521 to update the defaults in the near future. Cheers!
@pranali.khanna101994_189965 Congratulations on your effort to complete the DSE201 course on DataStax Academy. I appreciate that you noticed this and asked a good question.
As far as the recommended number of vnodes is concerned, it keeps evolving as we strive to improve the performance of Cassandra clusters. The DataStax recommendation has changed since we made the course.
When the initial work on vnodes in Cassandra was started, 256 or 128 tokens were chosen in order to ensure even distribution of data. Unfortunately, there were several unforeseen (and unpredictable) consequences of choosing a number this high. A high vnode count caused issues with bootstrapping new nodes (lots of SSTables), longer repair times, and overall higher CPU usage.
At the moment a new token allocation algorithm is being used, which efficiently balances the workload using fewer tokens. The allocation algorithm attempts to choose tokens in a way that optimizes replicated load over the nodes in the datacenter for the specified RF.
DataStax recommends using 8 vnodes (tokens).
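As a rough sketch, that recommendation could look like the following in cassandra.yaml. The replication factor of 3 is an assumption chosen for illustration. The name of the allocation property also varies by version (as far as I know, allocate_tokens_for_local_replication_factor in DSE and Cassandra 4.0, and allocate_tokens_for_keyspace in Cassandra 3.x), so check it against the documentation for the release you are running.

```yaml
# cassandra.yaml (excerpt, illustrative only)

# Fewer tokens per node, as recommended above.
num_tokens: 8

# Assumption: the keyspaces in this datacenter use a replication factor of 3.
# This enables the token allocation algorithm, which optimizes token
# placement for the given RF. Verify the property name for your version.
allocate_tokens_for_local_replication_factor: 3
```

These settings only take effect for nodes that have not bootstrapped yet.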
Please refer to the following documentation for more details.
New token allocation algorithm in Cassandra 3.0