Bringing together the Apache Cassandra experts from the community and DataStax.

pranali.khanna101994_189965 asked ·

Isn't data supposed to be written to where the partition key is hashed?

I read that writes can be sent to any node in the cluster. Aren't they supposed to be written to the node that the partition key hashes to, so that retrieval is efficient? And how does replication keep retrieval efficient if the same data is found on multiple nodes? Lots of confusion!

I took reference from here :

https://docs.datastax.com/en/archived/cassandra/2.1/cassandra/architecture/architectureDataDistributeHashing_c.html

Please advise.

cassandra replication

1 Answer

alex.ott answered ·

There are two things to distinguish:

  1. Who accepts the write or read request from a client: this is the so-called coordinator node. It can be any node in the cluster.
  2. Who actually stores the data: these are the replicas, which are determined by the hash of the partition key value (the token).

For queries to be efficient, 1 and 2 should be the same node, so it is the client's job to send each request to a coordinator that is also a replica. Inside a driver this is handled by the load balancing policy, which is token-aware by default: it can calculate the token and identify the replicas that the query needs to be sent to (please note that in the drivers, token-aware routing happens only for prepared queries). You can read more about token-aware routing in the Java driver documentation or in the developing with drivers guide.
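To make the coordinator/replica distinction concrete, here is a rough sketch of the idea in Python. It is not driver code: the four-node `RING`, the node names, the tiny 0..255 token space, and the MD5-based `token_for` are all assumptions made for illustration (Cassandra's default partitioner is Murmur3 over a 64-bit token range, and real clusters usually have many virtual nodes per host). Replica placement here simply walks the ring clockwise, roughly what SimpleStrategy does.

```python
import bisect
import hashlib

# Hypothetical 4-node ring: (token, node), sorted by token.
RING = [(0, "node-a"), (64, "node-b"), (128, "node-c"), (192, "node-d")]
TOKENS = [token for token, _ in RING]


def token_for(partition_key):
    """Stand-in hash mapping a key into 0..255 (NOT Cassandra's Murmur3)."""
    return hashlib.md5(partition_key.encode("utf-8")).digest()[0]


def replicas_for(token, rf=3):
    """Return the rf nodes that store this token.

    The first replica is the first node whose token is >= the key's token
    (wrapping around the ring); the remaining rf-1 replicas are the next
    nodes clockwise.
    """
    start = bisect.bisect_left(TOKENS, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(rf)]


def pick_coordinator(partition_key, rf=3):
    """Token-aware choice: contact a node that is itself a replica."""
    return replicas_for(token_for(partition_key), rf)[0]


if __name__ == "__main__":
    key = "user-42"
    print("replicas:", replicas_for(token_for(key)))
    print("coordinator:", pick_coordinator(key))
```

A client that is not token-aware could still send the request to any node; that node would act as coordinator and forward the request to the replicas, which costs an extra network hop.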


Thanks! But if the same data is replicated to one or more other nodes, that means one partition key exists on more than one node. In that case, when we say the data is stored based on the hash value, how does that work?

smadhavan, replying to pranali.khanna101994_189965:

@pranali.khanna101994_189965, here are a couple of additional resources to help you understand token ranges and their allocation algorithms:
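As a small illustration of how one partition can live on several nodes while placement is still driven by the hash, the sketch below lists the token ranges each node ends up storing. Everything in it is assumed for the example: the four-node `RING`, the 0..255 token space, and the simple "next nodes clockwise" placement (similar in spirit to SimpleStrategy; NetworkTopologyStrategy also takes racks and datacenters into account).

```python
# Hypothetical ring: (token, node), sorted by token.
RING = [(0, "node-a"), (64, "node-b"), (128, "node-c"), (192, "node-d")]


def primary_ranges(ring):
    """Map each node to the (start, end] token range it owns as first replica."""
    ranges = {}
    for i, (token, node) in enumerate(ring):
        prev_token = ring[i - 1][0]  # index -1 wraps around to the last node
        ranges[node] = (prev_token, token)
    return ranges


def replicated_ranges(ring, rf):
    """Map each node to every range it stores.

    A range is copied to its owner and to the next rf-1 nodes clockwise,
    so each node holds its own range plus those of the rf-1 nodes before it.
    """
    primary = primary_ranges(ring)
    nodes = [node for _, node in ring]
    stored = {}
    for i, node in enumerate(nodes):
        stored[node] = [primary[nodes[(i - k) % len(nodes)]] for k in range(rf)]
    return stored


if __name__ == "__main__":
    for node, ranges in replicated_ranges(RING, rf=3).items():
        print(node, ranges)
```

The hash of the partition key picks the range, and the range picks the whole set of replicas, so every copy of a partition can be located from the token alone without searching the cluster.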
