From today's Q and A session, 08/28/2020, I understood that SSTables can be removed even with the cluster running. Did I get it right? That is really cool for time series data.
So, there is no metadata tracking SSTables in Cassandra?
Yes, Cassandra keeps track of the SSTables it manages. I feel that you've taken Patrick McFadin's comments out of context so allow me to explain.
Cassandra keeps track of all the SSTables it owns. On startup you will see in debug.log that C* opens all of those files, so it knows what data is on disk, including the partition indexes, partition summaries, etc.
For the purposes of time series data, we recommend you use TimeWindowCompactionStrategy (TWCS) with a TTL on the data. When all the data in an SSTable (time window) is fully expired, TWCS simply drops (deletes) that SSTable from the filesystem. Cheers!
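To make the TWCS plus TTL recommendation concrete, here is a minimal sketch that builds the CQL for such a table. The keyspace, table and column names, the 7-day TTL and the 1-day window are all assumptions chosen for illustration, so size them for your own workload. Only `default_time_to_live`, `TimeWindowCompactionStrategy`, `compaction_window_unit` and `compaction_window_size` are actual Cassandra options.

```python
def twcs_table_cql(keyspace, table, ttl_seconds, window_unit="DAYS", window_size=1):
    """Build a CREATE TABLE statement for a hypothetical time series table.

    The schema (sensor_id, reading_time, value) is illustrative only.
    """
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} (\n"
        "    sensor_id text,\n"
        "    reading_time timestamp,\n"
        "    value double,\n"
        "    PRIMARY KEY ((sensor_id), reading_time)\n"
        # Table-level TTL: every write expires after ttl_seconds unless overridden.
        f") WITH default_time_to_live = {ttl_seconds}\n"
        # TWCS groups SSTables into time windows of window_size window_unit.
        "AND compaction = {\n"
        "    'class': 'TimeWindowCompactionStrategy',\n"
        f"    'compaction_window_unit': '{window_unit}',\n"
        f"    'compaction_window_size': {window_size}\n"
        "};"
    )


# Assumed values: 7-day TTL with 1-day windows.
statement = twcs_table_cql("metrics", "sensor_readings", ttl_seconds=7 * 24 * 3600)
print(statement)
```

You could paste the printed statement into cqlsh or run it through whichever driver you use. With settings like these, once every cell in a window's SSTable has passed its TTL, that SSTable becomes a candidate for being dropped whole, as described above.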