Say I have 2 nodes and a table that has 5 partitions. How would those be distributed across the nodes? If the number of partitions is equal to the number of nodes, that's understandable, but what if the number of partitions is greater than the number of nodes?
The number of partitions doesn't have any relationship with the number of nodes in a cluster.
Each node in the cluster is responsible for a range of tokens, which corresponds to a subset of the data in the "ring" of tokens. With virtual nodes enabled (the num_tokens setting), each node owns multiple token ranges.
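As a side note, vnodes are configured with num_tokens in cassandra.yaml. A minimal, illustrative snippet (the value here is only an example; the default differs between Cassandra versions, so check yours):

    # cassandra.yaml
    # Each node claims this many tokens on the ring, so it owns
    # this many token ranges instead of a single contiguous one.
    num_tokens: 16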
Cassandra's partitioner uses a hashing algorithm to convert a partition key into a token value; the default Murmur3Partitioner uses the MurmurHash3 function. Each partition in a table gets distributed around the ring of nodes in each DC based on the token value derived by the partitioner.
To illustrate how Cassandra determines where partitions are stored, let me use this example 3-node cluster:
$ nodetool ring community

Datacenter: Cassandra
==========
Address        Rack   Status  State   Load        Owns    Token
                                                          3074457345618258602
10.101.33.146  rack1  Up      Normal  174.98 KiB  33.33%  -9223372036854775808
10.101.33.196  rack1  Up      Normal  164.12 KiB  33.33%  -3074457345618258603
10.101.36.82   rack1  Up      Normal  200.86 KiB  33.33%  3074457345618258602
From this example, here are the ranges of tokens owned by each node:

10.101.36.82  : -3074457345618258602 to 3074457345618258602
10.101.33.146 : 3074457345618258603 to -9223372036854775808 (wrapping around the ring)
10.101.33.196 : -9223372036854775807 to -3074457345618258603
As you can see from the table above, each node owns a large range of tokens. When you add or remove nodes, the token ranges owned by each node change accordingly.
Using an example partition key pk = 'Ramirez', the Murmur3 hash function returns this token value:

Murmur3Partitioner('Ramirez') = -4404190778401486753
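If you'd like to verify this yourself, here is a minimal sketch using the DataStax Python driver (assuming you have cassandra-driver installed; note that the partitioner hashes the serialized partition key, so a text key is encoded to UTF-8 bytes first):

    # Compute the Murmur3 token for a partition key with the
    # DataStax Python driver (pip install cassandra-driver).
    from cassandra.metadata import Murmur3Token

    # The partitioner hashes the *serialized* key; a text column
    # serializes to its UTF-8 bytes.
    token = Murmur3Token.from_key('Ramirez'.encode('utf-8'))
    print(token.value)  # expected: -4404190778401486753

You can also get the same value directly from CQL with SELECT token(pk) FROM <table> WHERE pk = 'Ramirez'.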
This means that the partition with pk = 'Ramirez' belongs to node 10.101.33.196, since the token for 'Ramirez' (-4404190778401486753) falls within the range owned by node 10.101.33.196 in the table above.
When a coordinator gets a write request for the partition where pk = 'Ramirez', it forwards the write to node 10.101.33.196. Similarly, when a coordinator gets a read request for that partition, the data is retrieved from node 10.101.33.196.
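To make the coordinator's routing concrete, here is a minimal, illustrative Python sketch of the token-to-node lookup, using the ring from the example above. This is not Cassandra's actual implementation, and it ignores vnodes and replication:

    import bisect

    # Sorted ring tokens and the node owning each one (from nodetool ring above).
    # A node owns the range (previous_token, its_token], wrapping at the ends.
    ring = [
        (-9223372036854775808, '10.101.33.146'),
        (-3074457345618258603, '10.101.33.196'),
        (3074457345618258602, '10.101.36.82'),
    ]
    tokens = [t for t, _ in ring]

    def owner(token):
        # First ring token >= the partition's token; wrap around
        # to the smallest ring token if we run off the end.
        i = bisect.bisect_left(tokens, token)
        return ring[i % len(ring)][1]

    print(owner(-4404190778401486753))  # 10.101.33.196

Roughly speaking, with SimpleStrategy and replication factor N, the next N-1 distinct nodes clockwise around the ring hold the additional replicas.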
For more information, see Data distribution and replication. Cheers!