
anshita333saxena_187432 asked:

Why is the number of records returned different with CL LOCAL_ONE, LOCAL_QUORUM and ALL?

[FOLLOW UP QUESTION TO #5138]

Yesterday I ran a small test reading records from the table using different consistency levels:

Consistency level   Number of records
LOCAL_ONE           34517876
LOCAL_QUORUM        34546294
ALL                 34546533

These results show that ALL returned the most records. Is this because replication is not consistent across all the nodes? Can you please suggest...
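For reference, a count like this can be reproduced from cqlsh by switching the session consistency level between queries; `ks.tbl` below is a placeholder for the actual keyspace and table:

```
-- sketch only: switch the session consistency level, then re-run the count
CONSISTENCY LOCAL_ONE;
SELECT COUNT(*) FROM ks.tbl;

CONSISTENCY LOCAL_QUORUM;
SELECT COUNT(*) FROM ks.tbl;

CONSISTENCY ALL;
SELECT COUNT(*) FROM ks.tbl;
```

(Note that a full-table COUNT(*) over ~34M rows may time out in cqlsh; the sketch only illustrates the consistency-level switch.)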


@Erick Ramirez is this the expected behaviour? Did this happen because the data is not consistent across the nodes (i.e. a replication issue)?


1 Answer

Erick Ramirez answered:

The varying record counts indicate that some of the replicas missed writes, which means hints would have been stored on the coordinators when you performed the bulk load.

Since you have 2 data centres, both LOCAL_ONE and LOCAL_QUORUM reads will only return results from the local DC you queried. Reading with ALL will request the data from all nodes in all DCs.
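To make the replica counts concrete, assume the common setup of replication factor 3 in each of the 2 DCs (the keyspace and DC names below are hypothetical):

```
-- hypothetical keyspace with RF 3 per data centre
CREATE KEYSPACE ks
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};

-- a read then requires responses from:
--   LOCAL_ONE    -> 1 replica in the local DC
--   LOCAL_QUORUM -> floor(3/2) + 1 = 2 replicas in the local DC
--   ALL          -> all 3 + 3 = 6 replicas across both DCs
```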

Ordinarily, you would need to repair the problematic table on all nodes (with a rolling nodetool repair -pr, sketched below), but since you already performed a read with consistency ALL, it would have triggered a read repair and repaired all the partitions you queried. Cheers!
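A minimal sketch of what that rolling repair could look like, assuming SSH access to each node (the host, keyspace, and table names are placeholders):

```
# run a primary-range repair on one node at a time
for host in node1 node2 node3; do
  ssh "$host" nodetool repair -pr ks tbl
done
```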


@Erick Ramirez the cluster in which I ran this experiment has only a single DC. I still need to try this across DCs.

However, yes, reading with the ALL consistency level did trigger the automatic repair... So @Erick Ramirez, do we then not need to trigger a repair with `nodetool repair -pr`, since it can be done via our spark-cassandra-connector's ALL consistency level?
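For reference, a rough sketch of how the connector's read consistency level can be set when submitting a job; `spark.cassandra.input.consistency.level` is the connector's read-side property, and the jar/class names are placeholders:

```
# sketch only: raise the connector's read consistency to ALL for this job
spark-submit \
  --class com.example.CountJob \
  --conf spark.cassandra.input.consistency.level=ALL \
  count-job.jar
```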


I take back what I said about read repair getting triggered. If I remember correctly, you're using a different distribution of Cassandra (AWS Keyspaces? ScyllaDB?), so I don't know how your cluster behaves because those databases have their own implementations of Cassandra. For example, AWS Keyspaces isn't a "true" Cassandra DB under the hood -- I don't know for sure because the public docs are limited, but it has a CQL API engine in front of possibly a DynamoDB backend, so you can query DynamoDB with CQL.

But yes, if the nodes were overloaded by your bulk load, then there's a good chance they'd be inconsistent and would need to be repaired. Cheers!


Thanks a lot for your responses/help/directions Erick.
