DickChai asked Erick Ramirez edited

Deleting files in Cassandra data directory


I am running open source Cassandra and have run out of space in the data directory that stores the Cassandra database. When I list the data directory and sort the files by date, I see files with names ending in Data.db, Digest.crc32, Filter.db, Index.db, Summary.db, CompressionInfo.db, Statistics.db, and TOC.txt. Can I delete the files that have the same date?

Please help.


1 Answer

steve.lacerda answered steve.lacerda commented

Those are your data files, so if you delete them you may lose data (depending on your replication factor). Instead of touching the database files, I would look for things like snapshots and backups. In the same location as the Data.db, Digest.crc32, etc. files, you may see a snapshots directory. If it contains snapshots, that's where I'd start in order to free up space. There may also be a backups directory with backups that you can clear out.
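To see how much space snapshots and backups are taking, something like the following Python sketch could help. It assumes the usual on-disk layout of `<data_dir>/<keyspace>/<table>/snapshots` and `<data_dir>/<keyspace>/<table>/backups`; the function names are made up for illustration. It only reports sizes and deletes nothing. Note that snapshot files are hard links to SSTables, so the size reported is an upper bound on what removing a snapshot would actually free.

```python
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total apparent size in bytes of all regular files under path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def reclaimable_dirs(data_dir: str) -> list[tuple[str, int]]:
    """Find non-empty snapshots/ and backups/ directories under data_dir.

    Assumes the layout <data_dir>/<keyspace>/<table>/{snapshots,backups}.
    Returns (path, size_in_bytes) pairs, largest first.
    """
    found = []
    for name in ("snapshots", "backups"):
        for d in Path(data_dir).glob(f"*/*/{name}"):
            if d.is_dir():
                size = dir_size(d)
                if size > 0:
                    found.append((str(d), size))
    return sorted(found, key=lambda item: item[1], reverse=True)


# Example (hypothetical path; use the data_file_directories value
# from your cassandra.yaml):
#
#   for path, size in reclaimable_dirs("/var/lib/cassandra/data"):
#       print(f"{size / 1024**2:10.1f} MiB  {path}")
```

If this turns up snapshots, `nodetool listsnapshots` and `nodetool clearsnapshot` are the supported way to inspect and remove them, rather than deleting the directories by hand.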

Also, I would contact DataStax support if you need help with this, because depending on your replication factor and how you handle the matter, you could end up losing data.


DickChai commented
Hello Steve.lacerda,

Thank you for your response. I do not have a snapshots directory, and the backups directory is empty. I am not concerned about data loss. After all, the files that I want to delete are from 05/2019 to 09/2019. My concern is that Cassandra will stop working if I delete the files. As long as Cassandra still works after I delete them, I am OK.

By the way, I had contacted DataStax support and the support person asked me to pose this question on the community forum.

I look forward to your feedback. Thank you.

steve.lacerda commented (replying to DickChai)

If you're not concerned about data loss, then go ahead and shut down the Cassandra service, delete the files (all of them: Data, TOC, Index, etc.), and then restart the Cassandra service. The only problem is that if you run repairs, the data will be streamed back from the other replicas and come right back. So if you do run repairs and you really want this data gone, you'll need to delete the files on all nodes. Also, to avoid getting back into this scenario, I would recommend setting a default time to live on the table.
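Since all component files of an SSTable have to go together, a dry-run listing can help before deleting anything. The Python sketch below groups the files in one table directory by SSTable and returns only the sets whose every component was last modified before a cutoff date. It assumes file names such as `md-5-big-Data.db`, where everything before the last hyphen identifies the SSTable; the function name and the example path are hypothetical, and it deletes nothing.

```python
import os
from collections import defaultdict
from datetime import datetime


def sstables_older_than(table_dir: str, cutoff: datetime) -> dict[str, list[str]]:
    """Group component files by SSTable; keep sets last modified before cutoff.

    Assumes names like 'md-5-big-Data.db', where everything before the last
    hyphen identifies the SSTable and the rest is the component.
    """
    groups = defaultdict(list)
    for entry in os.scandir(table_dir):
        # Top-level files only: snapshots/ and backups/ are subdirectories.
        if entry.is_file() and "-" in entry.name:
            prefix, _component = entry.name.rsplit("-", 1)
            groups[prefix].append(entry.path)

    cutoff_ts = cutoff.timestamp()
    old = {}
    for prefix, paths in groups.items():
        # Select a set only if every one of its components predates the cutoff.
        if all(os.path.getmtime(p) < cutoff_ts for p in paths):
            old[prefix] = sorted(paths)
    return old


# Example (hypothetical table directory), run with Cassandra stopped:
#
#   old = sstables_older_than("/var/lib/cassandra/data/ks/tbl-1234", datetime(2019, 10, 1))
#   for prefix, paths in old.items():
#       print(prefix, len(paths), "files")
```

Reviewing the printed list first makes it less likely that a partial set (for example a Data.db without its Index.db) is left behind.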

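For the time-to-live suggestion, the table option is `default_time_to_live`, expressed in seconds. As a rough sketch, the Python helper below builds the corresponding CQL statement; the helper itself and the keyspace and table names in the example are made up for illustration, and the statement would still need to be run through cqlsh or a driver. Keep in mind that a default TTL applies only to data written after the change, so it will not expire rows that already exist.

```python
import re

# Simple unquoted identifiers only, to keep the generated statement safe.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def default_ttl_statement(keyspace: str, table: str, days: int) -> str:
    """Build a CQL ALTER TABLE statement that sets default_time_to_live.

    The option is expressed in seconds, so days are converted first.
    """
    for name in (keyspace, table):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a simple unquoted CQL identifier: {name!r}")
    if days <= 0:
        raise ValueError("days must be positive")
    seconds = days * 24 * 60 * 60
    return f"ALTER TABLE {keyspace}.{table} WITH default_time_to_live = {seconds};"


# Example with hypothetical names:
#
#   print(default_ttl_statement("my_keyspace", "events", 90))
#   # ALTER TABLE my_keyspace.events WITH default_time_to_live = 7776000;
```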