Looking for advice on this one: we have a number of redundant Data.db and other files that are older than the TTL period and taking up excess space. What would be the safest way to remove them without disrupting the main database itself?
Generally speaking, the older data files should be compacted away. If they are still hanging around, chances are they still contain live data and/or are not yet eligible for compaction. Perhaps you could include the schema for this table in your question to facilitate a more directed answer?
@graham.robertson_178283 SSTables are not necessarily redundant just because they are older than the table's default TTL. There are various reasons why data files hang around. As an example, SizeTieredCompactionStrategy (STCS) only compacts SSTables of similar size together, and by default it needs at least 4 of them. Older SSTables are generally larger, so it takes a while before they end up with compaction "partners".
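To illustrate why a large, old SSTable can sit untouched, here is a rough Python sketch of size-tiered bucketing. It is a simplified model and not Cassandra's actual code; the function names are made up for the example, and the constants are what I understand the default STCS sub-options (`bucket_low`, `bucket_high`, `min_sstable_size`, `min_threshold`) to be.

```python
from typing import List

# Assumed STCS defaults; check your table's compaction sub-options.
BUCKET_LOW = 0.5
BUCKET_HIGH = 1.5
MIN_SSTABLE_SIZE = 50 * 1024 * 1024  # 50 MB
MIN_THRESHOLD = 4

MB = 1024 ** 2
GB = 1024 ** 3


def bucket_by_size(sizes: List[int]) -> List[List[int]]:
    """Group SSTable sizes (in bytes) into buckets of similar size."""
    buckets: List[List[int]] = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # Similar means within [bucket_low, bucket_high] of the bucket average.
            similar = BUCKET_LOW * avg < size < BUCKET_HIGH * avg
            # Very small SSTables are all lumped together.
            both_small = size < MIN_SSTABLE_SIZE and avg < MIN_SSTABLE_SIZE
            if similar or both_small:
                bucket.append(size)
                break
        else:
            # No existing bucket fits, so this SSTable starts its own.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes: List[int]) -> List[List[int]]:
    """Return only the buckets that hold enough SSTables to be compacted."""
    return [b for b in bucket_by_size(sizes) if len(b) >= MIN_THRESHOLD]


# Hypothetical table: one old 200 GB SSTable plus four recent ~100 MB ones.
sizes = [200 * GB, 100 * MB, 110 * MB, 95 * MB, 105 * MB]
candidates = compaction_candidates(sizes)
print(candidates)
```

In this made-up example the four small SSTables form a bucket and qualify for compaction, while the 200 GB SSTable sits alone in its own bucket. It stays on disk until enough similarly sized SSTables exist to join it, even if much of its data has already expired.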
We recommend that you let normal compaction deal with the SSTables and that you don't manually delete them from the filesystem. If you're dealing with a really large STCS SSTable and you don't want to wait (the server is running out of space, for example), you might be interested in this KB article I wrote: FAQ - How to split large SSTables on another server. Cheers!