Bringing together the Apache Cassandra experts from the community and DataStax.


question

satvantsingh_190085 asked ·

How does read work in Cassandra?

How does a SELECT query know which node to fetch data from? Does a read also go through the partitioner and generate a hash token?

cassandra read

Erick Ramirez answered ·

The partitioner determines where partitions are stored in a ring (data centre). The partition key in a read request gets converted to a token by the partitioner using a consistent hashing algorithm. The default partitioner is the Murmur3Partitioner.
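As a rough illustration of the key-to-token step, here is a minimal Python sketch. Everything in it is an assumption for demonstration: `toy_token` uses MD5 from the standard library as a stand-in hash (it is not Murmur3), and `TokenRing`, the node names, and the token values are hypothetical. It only shows the idea that a key hashes to a token and the token falls into a range owned by a node.

```python
import bisect
import hashlib
from typing import Dict


def toy_token(partition_key: str) -> int:
    """Hash a partition key to a signed 64-bit token.

    Stand-in only: this uses MD5 from the standard library, NOT Murmur3.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class TokenRing:
    """Hypothetical ring with a single token per node."""

    def __init__(self, node_tokens: Dict[str, int]) -> None:
        # Keep (token, node) pairs sorted so we can binary-search by token.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def owner(self, token: int) -> str:
        """Return the node whose range (previous token, own token] holds `token`."""
        idx = bisect.bisect_left(self._tokens, token)
        # Tokens above the highest node token wrap around to the first node.
        return self._ring[idx % len(self._ring)][1]


ring = TokenRing({
    "node-a": -6_000_000_000_000_000_000,
    "node-b": 0,
    "node-c": 6_000_000_000_000_000_000,
})

token = toy_token("user:42")
print(token, ring.owner(token))
```

The same lookup happens for writes and reads, which is why the same key always maps to the same place in the ring.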

For more info, see the Cassandra Partitioners. You might also be interested in the Cassandra read path. Cheers!


saravanan.chinnachamy_185977 answered ·

@satvantsingh_190085 Yes, the read path also goes through the partitioner to determine which nodes have the data. The actual read involves many more steps, though, so I have included some references below for a detailed explanation.

A client can connect to any node in the cluster to perform reads, without having to know whether that node is a replica for the data. If the client connects to a node that doesn’t have the data it’s trying to read, that node acts as the coordinator and reads the data from a node that does have it, identified by token ranges. The partitioner determines how data is distributed across the nodes in the cluster.
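To make the coordinator idea concrete, here is a small Python sketch. It is a simplification under stated assumptions: the four-node `RING`, the integer tokens, and the helper names are all hypothetical, and replicas are chosen by simply walking the ring clockwise from the token's owner (similar in spirit to a simple placement strategy, ignoring racks and data centres).

```python
import bisect
from typing import List

# Hypothetical ring: (token, node) pairs sorted by token, one token per node.
RING = [(-50, "node-a"), (0, "node-b"), (50, "node-c"), (100, "node-d")]


def replicas_for(token: int, replication_factor: int) -> List[str]:
    """Walk the ring clockwise from the token's owner, collecting nodes."""
    tokens = [t for t, _ in RING]
    # First node whose token is >= the requested token, wrapping at the end.
    start = bisect.bisect_left(tokens, token) % len(RING)
    count = min(replication_factor, len(RING))
    return [RING[(start + i) % len(RING)][1] for i in range(count)]


def describe_read(coordinator: str, token: int, replication_factor: int = 3) -> str:
    """Say whether the contacted node holds the data or must ask a replica."""
    replicas = replicas_for(token, replication_factor)
    if coordinator in replicas:
        return f"{coordinator} is a replica for token {token}"
    return f"{coordinator} coordinates and asks one of {replicas}"


print(describe_read("node-b", 10))
print(describe_read("node-c", 10))
```

Either way, the client does not need to know the layout: whichever node it contacts works out the replicas from the token.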

Cassandra processes data at several stages on the read path to discover where the data is stored, starting with the data in the memtable and finishing with SSTables.
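On a single replica, the memtable-then-SSTables lookup can be pictured with the Python sketch below. It is a toy model, not Cassandra's storage engine: the dictionaries stand in for the memtable and SSTables, the `(timestamp, value)` cell layout is an assumption for illustration, and intermediate steps such as Bloom filters and caches are left out. The point it shows is that several sources may hold a version of the same data and the most recently written one wins.

```python
from typing import Dict, List, Optional, Tuple

# Assumed toy cell layout: (write timestamp, value).
Cell = Tuple[int, str]


def read_from_replica(
    key: str,
    memtable: Dict[str, Cell],
    sstables: List[Dict[str, Cell]],
) -> Optional[str]:
    """Collect every version of `key` and return the newest value."""
    candidates: List[Cell] = []
    # Start with the in-memory memtable.
    if key in memtable:
        candidates.append(memtable[key])
    # Then check each on-disk SSTable that contains the key.
    for sstable in sstables:
        if key in sstable:
            candidates.append(sstable[key])
    if not candidates:
        return None
    # The version with the highest write timestamp wins.
    return max(candidates, key=lambda cell: cell[0])[1]


memtable = {"user:42": (5, "new@example.com")}
sstables = [{"user:42": (3, "old@example.com")}, {"user:7": (1, "x@example.com")}]
print(read_from_replica("user:42", memtable, sstables))
```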

The read path begins when a client sends a read query to the coordinator node. The coordinator uses the partitioner to determine the replicas and checks that enough replicas are up to satisfy the requested consistency level. If the coordinator is not itself a replica, it then sends a read request to the fastest replica, as determined by the snitch. The coordinator also sends a digest request to the other replicas. A digest request is similar to a standard read request, except that the replicas return a digest, or hash, of the requested data.
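The coordinator steps described above can be sketched in Python as follows. This is a simplified model under several assumptions: `Replica`, `coordinate_read`, and the latency numbers are hypothetical, sorting by `latency_ms` stands in for the snitch, SHA-256 is an arbitrary choice of digest, and only as many replicas as the consistency level needs are contacted. What happens when digests disagree is not shown; the sketch only reports the mismatch.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple


def digest_of(value: str) -> str:
    """Hash of the data, returned instead of the data for digest requests."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


@dataclass
class Replica:
    name: str
    value: str
    latency_ms: float  # stand-in for what the snitch would know
    up: bool = True

    def read(self) -> str:
        return self.value

    def read_digest(self) -> str:
        return digest_of(self.value)


def required_replicas(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses needed for a few common levels."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def coordinate_read(replicas: List[Replica], consistency_level: str) -> Tuple[str, bool]:
    """Return (data, digests_match) or raise if too few replicas are up."""
    needed = required_replicas(consistency_level, len(replicas))
    live = [r for r in replicas if r.up]
    if len(live) < needed:
        raise RuntimeError("not enough live replicas for " + consistency_level)
    # Fastest replica first: it gets the full read, the rest get digest requests.
    contacted = sorted(live, key=lambda r: r.latency_ms)[:needed]
    data = contacted[0].read()
    digests_match = all(r.read_digest() == digest_of(data) for r in contacted[1:])
    return data, digests_match


replicas = [
    Replica("node-a", "v1", latency_ms=5.0),
    Replica("node-b", "v1", latency_ms=2.0),
    Replica("node-c", "v2", latency_ms=9.0),
]
print(coordinate_read(replicas, "QUORUM"))
print(coordinate_read(replicas, "ALL"))
```

Sending digests rather than full data to the other replicas keeps network traffic low while still letting the coordinator notice when replicas disagree.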

For a detailed explanation of all the steps that occur in the read path, please refer to

Read Path

You can also refer to DataStax academy for a visual explanation of the read path at

DataStax Academy
