Bringing together the Apache Cassandra experts from the community and DataStax.




anshulsaini asked Erick Ramirez commented

Why does one of three nodes with RF=3 show more disk usage?

Topology: 2 DCs, 3 nodes per DC, RF=3


C* version 2.1

In both DCs, 1 of the 3 nodes shows more disk usage (> 50%) than the other 2.

As per my understanding, the data should be distributed equally across the nodes, since the RF is 3 and we have only 3 nodes in each DC.


1 Answer

Erick Ramirez answered Erick Ramirez commented

You are correct in that each node should have a full copy of the data when each DC has 3 nodes and RF=3. However, you haven't provided enough information for us to work out what the underlying problem is.

You need to verify that your conclusion that a node "shows more disk usage" is correct. Comparing disk space at the filesystem layer isn't necessarily a good check, since snapshots may exist on a node and take up space on top of the live data. Cheers!

2 comments

Thanks for your response. Yes, the analysis is based on disk space usage at filesystem level on nodes.

I have suggested that the user run the commands below on the affected nodes:

nodetool cleanup

nodetool scrub

nodetool clearsnapshot

Is there any other check we can do to ensure that the space taken by the CFs is the same across the nodes, or any other housekeeping activity we can carry out?


You need to verify at the filesystem level what is occupying the space and then check whether those files/subdirectories should be there. Cheers!

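As a rough illustration of the filesystem-level check described above, here is a minimal Python sketch that walks a Cassandra data directory and totals file sizes per table, split into live SSTables, snapshots, and incremental backups. It assumes the usual <data_dir>/<keyspace>/<table>/ layout with snapshots/ and backups/ subdirectories under each table. The function names usage_by_table and print_report are made up for this example, and the data directory path is something you would need to supply for your own install. Note that snapshot files are hard links to SSTables, so the snapshot figure is an apparent size: it only represents extra disk space for SSTables that have since been compacted away.

```python
import os
from collections import defaultdict


def usage_by_table(data_dir):
    """Return {(keyspace, table_dir): {"live": n, "snapshots": n, "backups": n}} in bytes."""
    usage = defaultdict(lambda: {"live": 0, "snapshots": 0, "backups": 0})
    for root, _dirs, files in os.walk(data_dir):
        parts = os.path.relpath(root, data_dir).split(os.sep)
        if len(parts) < 2:
            continue  # still at data_dir or keyspace level, not inside a table directory
        keyspace, table = parts[0], parts[1]
        # Anything under <table>/snapshots or <table>/backups is not live data.
        if len(parts) > 2 and parts[2] in ("snapshots", "backups"):
            bucket = parts[2]
        else:
            bucket = "live"
        for name in files:
            try:
                size = os.lstat(os.path.join(root, name)).st_size
            except OSError:
                continue  # file disappeared mid-walk (e.g. removed by compaction)
            usage[(keyspace, table)][bucket] += size
    return dict(usage)


def print_report(data_dir):
    """Print tables sorted by total apparent size, largest first."""
    rows = usage_by_table(data_dir)
    ordered = sorted(rows.items(), key=lambda kv: sum(kv[1].values()), reverse=True)
    for (keyspace, table), u in ordered:
        print(f"{keyspace}/{table}: live={u['live']} "
              f"snapshots={u['snapshots']} backups={u['backups']}")
```

Running print_report on the data directory of each node and comparing the output side by side should show whether the extra space sits in live SSTables, in snapshots, or in backups, and for which tables. Files that sit outside the keyspace/table directories are not counted by this sketch, so they would still need to be checked by hand.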