vranganathan asked ·

nodetool load does not seem to refresh

BEFORE

UN  172.x.x.x  186.44 TB  256          ?       f2f425ce-1059-4ef3-b1d4-e095701a9439  us-east-1e

-------

*Restart the node*

-------

AFTER

UN  172.x.x.x  231.59 GB  256          ?       f2f425ce-1059-4ef3-b1d4-e095701a9439  us-east-1e


How is the load on a host decided? The official documentation says:

The amount of file system data under the cassandra data directory after excluding all content in the snapshots subdirectories. Because all SSTable data files are included, any data that is not cleaned up (such as TTL-expired cells or tombstoned data) is counted.
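
One way to cross-check the reported load against that definition is to measure the data directory directly. The following is a minimal sketch, not how Cassandra itself computes the figure: it assumes the definition quoted above is the whole story (only `snapshots` subdirectories are excluded), and the function name `data_dir_size_excluding_snapshots` is made up for illustration.

```python
import os


def data_dir_size_excluding_snapshots(data_dir):
    """Sum file sizes under data_dir, skipping every 'snapshots' subdirectory.

    Assumption: this mirrors the documented definition of load only;
    it is not taken from the Cassandra source.
    """
    total = 0
    for root, dirs, files in os.walk(data_dir):
        # Prune snapshot directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if d != "snapshots"]
        for name in files:
            path = os.path.join(root, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # A file can disappear mid-walk, for example when a compaction finishes.
                pass
    return total
```

Calling it on the node's data directory (the location depends on the `data_file_directories` setting in cassandra.yaml) gives a byte count to compare with the load column of nodetool status. If the on-disk total is already near the smaller figure before the restart, that would suggest the reported value is stale rather than the data being large.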


This number (186.44 TB) did not come down after running "nodetool cleanup", nor after the node had finished all of its compactions.


This only came down after a restart. Why does this happen?

Tags: nodetool, apache cassandra, 3.0.9, load

aholmber answered ·

Did you notice that those numbers have different units? It looks like it has gone down three orders of magnitude from hundreds of terabytes to a couple hundred gigabytes.
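
To put a number on that drop, here is a small sketch that converts both load strings to bytes and compares them. It assumes the unit labels are 1024-based, which is my understanding of how nodetool formats sizes in this version but is not confirmed in this thread; `UNITS` and `load_to_bytes` are names made up for illustration.

```python
# Assumption: each unit step is a factor of 1024.
UNITS = {"bytes": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def load_to_bytes(text):
    """Convert a load string such as '186.44 TB' to a number of bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


before = load_to_bytes("186.44 TB")
after = load_to_bytes("231.59 GB")
ratio = before / after
print(f"load shrank by a factor of about {ratio:.0f}")
```

Whether the units are taken as 1024-based or 1000-based, the factor comes out at roughly 800, so just under three orders of magnitude.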


vranganathan answered ·

@aholmber Yes, I am aware of it, and that is exactly my question. How come restarting a node clears so much load?

A little more context:
I ran nodetool cleanup and waited for all active compactions to finish. I then waited a further 3 hours before making this comparison, running nodetool status just before the restart and again immediately after it.

This makes me think that the load reported by nodetool status might be flawed. There was no data loss; everything seems fine except for this huge reduction in the `load`.


aholmber answered ·

I see I misunderstood what was being asked. You're specifically wondering why the cleanup is not reflected until after the node is restarted.


I don't know if this is by design, but it does seem surprising to me. If it's reproducible, the Cassandra developers might appreciate a ticket on the Apache Cassandra (TM) Jira.
