
pranali.khanna101994_189965 asked ·

What happens to the rest of SSTables when the max_threshold is reached for STCS?

In STCS, whenever a bucket reaches its max_threshold value, the SSTables are trimmed to 32 by default. What happens to the rest of the SSTables? Are they dropped? Is data lost?

cassandra compaction stcs

1 Answer

bettina.swynnerton answered ·

Hi @pranali.khanna101994_189965,

Size-tiered compaction merges sets of SSTables that are approximately the same size. Cassandra compares each SSTable size to the average of all SSTable sizes on the node. It merges SSTables whose sizes in KB fall between [average-size × bucket_low] and [average-size × bucket_high].
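
To make the size window concrete, here is a minimal Python sketch of the check as described above. It is an illustration only, not Cassandra's code: the function name similar_size_sstables is hypothetical, the sizes are made up, and the defaults of 0.5 for bucket_low and 1.5 for bucket_high are the documented STCS defaults. The real implementation differs in detail.

```python
def similar_size_sstables(sizes_kb, bucket_low=0.5, bucket_high=1.5):
    """Return the SSTable sizes that fall inside the similarity window.

    Hypothetical helper: follows the simplified description in this answer,
    comparing every size against the average of all sizes.
    """
    if not sizes_kb:
        return []
    average = sum(sizes_kb) / len(sizes_kb)
    lower = average * bucket_low    # smallest size still considered "similar"
    upper = average * bucket_high   # largest size still considered "similar"
    return [size for size in sizes_kb if lower <= size <= upper]


# Made-up sizes in KB: five small SSTables and one much larger one.
sizes = [100, 120, 80, 110, 90, 400]
print(similar_size_sstables(sizes))  # [100, 120, 80, 110, 90]
```

With these sizes the average is 150 KB, so the window is 75 KB to 225 KB. The 400 KB SSTable falls outside it and is not a candidate for merging with the smaller ones.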

The subproperties min_threshold and max_threshold control when a minor compaction is triggered and how many SSTables can be merged in a single minor compaction.

A minor compaction is triggered once the number of similarly sized SSTables reaches min_threshold. By default that is 4.

Should you have more than max_threshold SSTables of similar size, the first compaction merges no more than max_threshold of them. The remaining SSTables are left on disk untouched. If the min_threshold is still met after that compaction (i.e. at least min_threshold SSTables of similar size remain), the next compaction merges the next set, again up to a maximum of max_threshold.

No data is lost.
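
The batching behaviour can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Cassandra's implementation: the function name plan_minor_compactions is hypothetical, and it assumes that each merged output is large enough to leave the bucket and that no new SSTables are flushed in the meantime.

```python
def plan_minor_compactions(sstable_count, min_threshold=4, max_threshold=32):
    """Model successive minor compactions over one bucket of similar-size SSTables.

    Returns the number of SSTables merged in each round and the number left
    waiting in the bucket. Hypothetical helper for illustration only.
    """
    batches = []
    remaining = sstable_count
    # A compaction only starts while at least min_threshold SSTables are present.
    while remaining >= min_threshold:
        # Each round merges at most max_threshold SSTables.
        batch = min(remaining, max_threshold)
        batches.append(batch)
        remaining -= batch
    return batches, remaining


print(plan_minor_compactions(70))  # ([32, 32, 6], 0)
print(plan_minor_compactions(35))  # ([32], 3)
```

In both cases the merged SSTables plus the ones left waiting add up to the original count. With 35 SSTables, the three left over are not dropped. They stay on disk until enough similarly sized SSTables accumulate to meet min_threshold again.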

I can't see many situations where the default max_threshold of 32 would be hit, unless automatic compaction was disabled for an extended period, allowing a significant number of SSTables to build up, or the compaction strategy was changed. In those cases you want to avoid merging too many SSTables into a single very large SSTable, and max_threshold lets you set an upper limit on the number that can be merged at once. In most cases, there is no need to change these default settings.

I hope this clarifies how a minor compaction is triggered (by min_threshold) and what the max_threshold setting defines.
