Bringing together the Apache Cassandra experts from the community and DataStax.


pranali.khanna101994_189965 asked ·

Why does nodetool getendpoints return multiple IP addresses?

I have a cluster with 3 nodes and RF=2. I have inserted one record, and when I execute

nodetool getendpoints keyspacename tablename partitionkey/tokenvalue of the specific partition

it gives me 2 IP addresses, which is because RF=2. Does that mean this token also belongs to 2 different nodes?

cqlsh:killrvideo> select token(tag),tag from videos_by_tag;

 system.token(tag)   | tag
----------------------+-----------
-1651127669401031945 | datastax
-1651127669401031945 | datastax
356242581507269238   | cassandra
356242581507269238   | cassandra
356242581507269238   | cassandra
$ nodetool getendpoints killrvideo videos_by_tag -1651127669401031945
10.61.27.142
10.61.27.226

Does that mean the token -1651127669401031945 (the partition key's token) also gets replicated along with the data?

I don't understand: is this token on both nodes, then?

replication

Erick Ramirez answered ·

With a replication factor of 2, you've instructed Cassandra to keep 2 copies of the data so:

  • the first copy is held on the node that owns the token range it belongs to
  • a second copy is held on the "neighbour", which is the node immediately to the right of the primary replica owner in the ring

When you run nodetool getendpoints, it will return the primary owner plus the neighbouring nodes in the ring. The number of neighbours returned will depend on the replication factor of the keyspace you're interested in.

The second and third replicas do not "own" the token you requested -- they just hold copies of the data for the adjacent token range.

Let me illustrate with this example diagram:

In the example, the data has a token value of 59 and the keyspace has a replication factor of 3. The purple node is the primary owner of the range so it has the first copy of the data.

With RF=3, it means that there are 3 total copies of the data so it is replicated to 2 additional neighbouring nodes. In this case, the red node is the second replica since it is immediately to the "right" of the purple node, and the blue node is the third replica since it is the next node around the ring.

To be clear, the token is not replicated to other nodes -- it is the partitions (data) that are replicated, so the other nodes hold copies of them.
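
As a rough sketch of that ring walk (not Cassandra's actual code), the Python below assumes a SimpleStrategy-style placement with one token per node. The ring tokens (20, 40, 60, 80, 100) and the green and orange nodes are made up for illustration; only the purple, red and blue nodes and the token value 59 come from the example above.

```python
from bisect import bisect_left

# Hypothetical ring, sorted by token. Each node owns the range that ends at
# its own token (inclusive), i.e. (previous node's token, own token].
RING = [
    (20, "green"),
    (40, "orange"),
    (60, "purple"),
    (80, "red"),
    (100, "blue"),
]

def replicas_for_token(ring, token, rf):
    """Return the primary owner of `token` followed by the next rf-1 nodes
    around the ring (assumed SimpleStrategy-like behaviour)."""
    tokens = [t for t, _ in ring]
    # First node whose token is >= the requested token; wrap past the end.
    start = bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

print(replicas_for_token(RING, 59, 3))  # ['purple', 'red', 'blue']
print(replicas_for_token(RING, 59, 2))  # ['purple', 'red']
```

Only the first name in each list owns token 59; the others appear purely because of the replication factor. With NetworkTopologyStrategy or vnodes, the walk also accounts for racks, datacentres and many tokens per node, so real placements are less tidy than this.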

DataStax Academy has an 8-minute video on Cassandra Replication you might be interested in. It will help clarify the concepts for you in a few minutes. Cheers!


replicas.png (447.1 KiB)

bettina.swynnerton answered ·

nodetool getendpoints provides the IP addresses of the replicas that hold the data for the partition key.

A consistent hashing algorithm maps the partition key to a Cassandra node, and only one node.

The other owners of this data are determined by the replication strategy and their position in the token ring. The partition key itself is part of your data, and this is replicated with your data.

There is only one node that owns the token that corresponds to the partition key.

The syntax for the command is:

nodetool getendpoints <keyspace> <table> <key>

The command will return the endpoints for any partition key, whether you have data for this key in your table or not.

It looks to me as though, in your example, the returned endpoints are for a partition key with the value "-1651127669401031945", as there is no option to specify the token value in the command.

I hope this makes it clear, let me know if you have further questions.


Thanks for your answer, @bettina.swynnerton.

I tried executing the command like this:

nodetool getendpoints killrvideo videos_by_tag datastax


Then I also get the same output:


10.61.27.142
10.61.27.226


This means the partition key is on both nodes, right? And that the token value of 'datastax', which is -1651127669401031945, is also copied to the replica node?

But as you said, 'datastax' will always hash to -1651127669401031945, which belongs to a single node, right? And when that node is down, the data will be fetched from the other replica node, right?

At that point the token -1651127669401031945 is replicated on that node but is not part of the replica node's token range, right?

bettina.swynnerton ♦♦ replied to pranali.khanna101994_189965 ·

Hi @pranali.khanna101994_189965,

I think you misunderstand the idea of the token value. The token value is calculated from the partition key; it is not stored with the data, and it is not replicated. Only one node has this token. The hashing algorithm resolves any partition key to exactly one token, which belongs to exactly one node. This algorithm works whether there is data for this partition or not.

The replication strategy and the token ring then determine which other nodes have a replica of the data for the partition key. Again, this can be uniquely determined without any data in the table.
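
To make the "no data needed" point concrete, here is a minimal Python sketch. It is only an illustration: Cassandra's default partitioner uses Murmur3, whereas `toy_token` below is a stand-in built on MD5 from the standard library, so its token values will not match what cqlsh prints. The node tokens and the 10.0.0.x addresses are hypothetical, and the placement assumes a simple walk around a ring with one token per node.

```python
import hashlib
from bisect import bisect_left

# Hypothetical 3-node ring, sorted by token; each node owns the range that
# ends at its own token.
RING = [
    (-3074457345618258603, "10.0.0.1"),
    (3074457345618258602, "10.0.0.2"),
    (9223372036854775807, "10.0.0.3"),
]

def toy_token(partition_key: str) -> int:
    """Stand-in hash (NOT Murmur3): map a key to a signed 64-bit token."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

def endpoints(partition_key: str, rf: int = 2) -> list:
    """Derive replica addresses from the key alone; no table is consulted."""
    tokens = [t for t, _ in RING]
    start = bisect_left(tokens, toy_token(partition_key)) % len(RING)
    count = min(rf, len(RING))
    return [RING[(start + i) % len(RING)][1] for i in range(count)]

print(endpoints("datastax"))
print(endpoints("a-key-that-was-never-inserted"))
```

Both calls return two addresses even though nothing is stored anywhere: the key hashes to one token, one node owns that token, and the second address comes from the replication factor.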

pranali.khanna101994_189965 replied to bettina.swynnerton ♦♦ ·

Hi @bettina.swynnerton, I understand that the token value is not replicated, but the partition key is replicated along with the data, right?

In my case, the partition key 'datastax' will be replicated along with the row, as RF=2, right? So when the node that owns the hash of this key (datastax = -1651127669401031945) is down, the query will be sent to the replica node to fetch the result, right?
