Question

anshita333saxena_187432 asked:

Why is the number of records returned different with CL LOCAL_ONE, LOCAL_QUORUM and ALL?

[FOLLOW UP QUESTION TO #5138]

Yesterday I ran a small test reading records from the table at different consistency levels:

Consistency level    Number of records
LOCAL_ONE            34517876
LOCAL_QUORUM         34546294
ALL                  34546533

This result shows that ALL returned the most records. Is this because replication is not consistent across all the nodes? Can you please suggest...
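
For illustration, a rough sketch of this kind of read in cqlsh (keyspace and table names are placeholders, and a full COUNT(*) like this may time out on a table of this size):

    -- set the read consistency for the cqlsh session, then count
    CONSISTENCY LOCAL_ONE;
    SELECT COUNT(*) FROM my_keyspace.my_table;

    -- repeat the same read at the strongest level
    CONSISTENCY ALL;
    SELECT COUNT(*) FROM my_keyspace.my_table;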

Tags: cassandra, replication
1 comment

@Erick Ramirez is this the expected behavior? Did this happen because the data is not consistent across the nodes (i.e. a replication issue)?


1 Answer

Erick Ramirez answered:

The varying number of records indicates that some of the replicas missed writes, which means hints would have been stored on the coordinators when you performed the bulk load.

Since you have 2 data centres, both LOCAL_ONE and LOCAL_QUORUM reads will only return results from the local DC you queried, while reading with ALL requests the data from all nodes in all DCs. For example, with a replication factor of 3 in each of 2 DCs, LOCAL_QUORUM only needs responses from 2 replicas in the local DC, whereas ALL needs responses from all 6 replicas across both DCs.

Ordinarily, you would need to repair the problematic table on all nodes (with a rolling nodetool repair -pr), but since you already performed a read with consistency ALL, it would have triggered a read-repair and repaired all the partitions you queried. Cheers!
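
For reference, a minimal sketch of that rolling repair, assuming placeholder keyspace and table names (run it on one node at a time, waiting for each run to finish before moving to the next node):

    # run on each node in turn (rolling); -pr repairs only the node's
    # primary token ranges so the same ranges aren't repaired repeatedly
    nodetool repair -pr my_keyspace my_table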

4 comments

@Erick Ramirez the cluster in which I ran this experiment has only a single DC. I still need to try this across DCs.

However, yes, reading with the ALL consistency level triggered the automatic repair... So @Erick Ramirez, do we then not have to trigger a repair with `nodetool repair -pr`, since it can be done via our spark-cassandra-connector with the ALL consistency level?
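
For context, this is roughly how the read consistency level is set for the connector (the property name is from the spark-cassandra-connector reference; the contact point and jar name are placeholders, and if I remember correctly the connector's default read consistency is LOCAL_ONE):

    # placeholders: contact point and application jar
    spark-submit \
      --conf spark.cassandra.connection.host=10.0.0.1 \
      --conf spark.cassandra.input.consistency.level=ALL \
      my-count-job.jar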

Erick Ramirez replied to anshita333saxena_187432:

I take back what I said about read-repair getting triggered. If I remember correctly, you're using a different distribution of Cassandra (AWS Keyspaces? ScyllaDB?), so I don't know how your cluster behaves because those databases have their own implementations of Cassandra. For example, AWS Keyspaces isn't a "true" Cassandra DB under the hood. I don't know for sure because public docs are limited, but it has a CQL API engine in front of possibly a DynamoDB backend, so you can query it with CQL.

But yes, if the nodes were overloaded by your bulk load, then there's a good chance the replicas are inconsistent and will need to be repaired. Cheers!


Thanks a lot for your responses/help/directions, Erick.
