
phofegger_148429 asked:

How do I prevent SSTables from growing too large?

Following the Cassandra migration (see this question: https://community.datastax.com/questions/3042/migrate-cassandra-to-new-service-provider-and-redu.html?childToView=3074#answer-3074, currently at step 7), I have seen SSTables grow very large (> 500 GB) after the data sync from an existing DC. Because of these big SSTables I will run into trouble with compactions and nodetool garbagecollect due to limited free disk space. I have to add a third virtual DC (DC2new) to the existing C* cluster and would like to avoid such large SSTables. Here are my questions:

a) Is it possible to set a maximum SSTable size for the data sync?

b) On the existing DC I would like to split some SSTables into smaller pieces of around 100 GB or at most 150 GB. Is this possible, and does it make sense?

Many thanks in advance.

cheers

patrick

Tags: cassandra, compaction, stcs

1 Answer

Erick Ramirez answered:

Having a single SSTable larger than 500GB indicates that your cluster has super-dense nodes, which can be problematic on its own.

In any case, I don't believe a standard nodetool rebuild would generate large SSTables. My guess is that you are manually running a major compaction (with nodetool compact) as a workaround for a tombstone problem. This workaround is an issue in itself because it only hides an underlying data model problem, one that I've previously written about in Why forcing major compaction on a table is not ideal.

If you must run a major compaction, you should run it with the "split output" flag (--split-output, or -s for short) so you don't end up with one single giant SSTable. This enhancement was implemented in Cassandra 2.2 (CASSANDRA-7272): for tables using STCS, it splits the output of nodetool compact into multiple files that are 50%, then 25%, then 12.5% of the original table size, and so on, until the smallest chunk reaches 50MB.
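
For example, a major compaction with split output would look something like this (the keyspace and table names here are just placeholders):

# major compaction that splits its output instead of producing one giant SSTable
nodetool compact --split-output keyspace01 table01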

If you would like to break up the large SSTables on the nodes, you can do so using the sstablesplit utility. I have previously documented a workaround which does not require downtime and involves copying the large SSTable to another server which is not part of the cluster. The high level steps are:

  1. Copy a single SSTable generation and its components to another server where Cassandra is installed (but not running).
  2. Run the sstablesplit utility on the SSTable.
  3. Copy the output files back to the source node.
  4. Temporarily shut down C*.
  5. Move the original [problematic] large SSTable out to another directory.
  6. Start C*.

For the detailed steps, see How to split large SSTables on another server. Cheers!
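
As a very rough sketch of those steps (the hostname, paths and SSTable generation mc-1234 below are placeholders; follow the linked article for the actual procedure):

# 1. copy one SSTable generation with all of its component files to the helper server;
#    sstablesplit expects the files to sit in a <keyspace>/<table> directory layout
scp /var/lib/cassandra/data/keyspace01/table01-*/mc-1234-big-* helper:/tmp/split/keyspace01/table01/

# 2. on the helper server (Cassandra installed but not running); sstablesplit lives in
#    the Cassandra tools/bin directory and -s is the maximum output size in MB
sstablesplit --no-snapshot -s 102400 /tmp/split/keyspace01/table01/mc-1234-big-Data.db

# 3. copy the split output back to the table directory on the source node
#    (excluding the original mc-1234 generation if it is still present)
scp helper:/tmp/split/keyspace01/table01/mc-*-big-* /var/lib/cassandra/data/keyspace01/table01-*/

# 4-6. on the source node: stop Cassandra, move the original mc-1234-big-* files out
#      to a backup directory, then start Cassandra again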

4 comments

Hi Erick, thank you for the answer.

I use garbagecollect for deleting tombstones, e.g. `nodetool garbagecollect -j 1 keyspace01 table01`. I tried to avoid running a major compaction; I think the auto-compaction after the rebuild created these big SSTables. Is it possible to set a maximum SSTable size to avoid creating big SSTables through compaction? I have seen that it is possible to run a major compaction on a single SSTable, but not with the --split-output option, is that right?

Is it possible to split SSTables on the same node? My plan would be (sketched as commands below):

  • stop Cassandra
  • run sstablesplit on one SSTable and split it into 100GB pieces. After the split, I think the original SSTable will still exist.
  • move the original SSTable (the whole component set) to another directory (as a backup)
  • start Cassandra again.

Is this plan possible? Thanks.

cheers, Patrick
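
Expressed as commands, the plan would look roughly like this (the paths, the generation number NNNN and the service commands are placeholders; note that sstablesplit's -s option takes a size in MB):

# 1. stop Cassandra on the node
sudo systemctl stop cassandra
# 2. split one SSTable into pieces of at most 100 GB (102400 MB)
/appl/cassandra/tools/bin/sstablesplit --no-snapshot -s 102400 /appl/data/<keyspace>/<table>/mc-NNNN-big-Data.db
# 3. move the original SSTable (whole component set) to a backup directory
mv /appl/data/<keyspace>/<table>/mc-NNNN-big-* /appl/backup/
# 4. start Cassandra again
sudo systemctl start cassandra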


I tested sstablesplit in a Cassandra DEV environment.

SSTable split on the same node:

  • stopped Cassandra
  • copied an SSTable (with its whole component set) to /appl/data/split/ and ran:

/appl/cassandra/tools/bin/sstablesplit --debug -s 100 --no-snapshot /appl/data/split/mc-2289-big-Data.db

I got the following message:

Exception in thread "main" java.lang.AssertionError: Unknown keyspace data

I changed the directory structure to /appl/data/split/<keyspace>/<table>/ and ran:

/appl/cassandra/tools/bin/sstablesplit --debug -s 100 --no-snapshot /appl/data/split/<keyspace>/<table>/*

Then it worked, but the output files were written to the original data path.

I then removed the old SSTable.

I started Cassandra again -> up & running -> Cassandra started a compaction and the new SSTable was bigger than before :-)

How can I prevent the new SSTable files from being compacted?

> How can I prevent the new SSTable files from being compacted?

You can't prevent it. Compaction is part of Cassandra's normal operation.

Is the table configured with TWCS? If it is, the SSTables will get compacted with STCS into one SSTable during the first "window". That's expected behaviour.
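
To check which compaction strategy the table is actually configured with, you could query the schema, for example (the keyspace and table names are placeholders):

# shows the compaction strategy and its options for the table
cqlsh -e "SELECT compaction FROM system_schema.tables WHERE keyspace_name = 'keyspace01' AND table_name = 'table01';"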

> Is it possible to set a maximum SSTable size to avoid creating big SSTables through compaction?

No, it isn't possible to set a maximum size.
