arunkolluri89_98476 asked Erick Ramirez commented

How should I proceed deleting 3 million records?

I have to delete 3 million records to free up space. How should I proceed? Should I delete a batch of 10,000 records first and wait for the gc_grace period to pass so that the tombstones get cleaned up, or should I run a manual compaction to remove the tombstones immediately?


1 Answer

Erick Ramirez answered Erick Ramirez commented

There's no ideal way to handle mass deletions in Cassandra.

If you're deleting a range of rows within partitions, this is going to be problematic, since any read of those partitions has to iterate over the deleted rows. If you are deleting whole partitions, it's not a real concern, since partition-level deletes don't suffer from the same problem as deleted rows.

Our general recommendation is to stretch out the deletions so the tombstones don't put pressure on your cluster. Break the deletions up over a long period of time as much as possible.
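As a rough illustration of stretching out the deletions, here is a minimal Python sketch that works through a list of keys in small chunks and pauses between chunks. Everything in it is an assumption for illustration: `throttled_delete`, `chunked`, `chunk_size` and `pause_seconds` are hypothetical names, and `delete_one` stands in for whatever issues a single delete in your application (for example, a wrapper around a prepared `DELETE` statement executed through your driver). The right chunk size and pause depend entirely on your cluster and workload.

```python
import time
from typing import Callable, Iterable, Iterator, List


def chunked(keys: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` keys."""
    chunk: List = []
    for key in keys:
        chunk.append(key)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield chunk


def throttled_delete(
    keys: Iterable,
    delete_one: Callable[[object], None],  # hypothetical: issues one delete
    chunk_size: int = 100,                 # assumed value, tune for your cluster
    pause_seconds: float = 1.0,            # assumed value, tune for your cluster
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Delete keys chunk by chunk, pausing between chunks.

    Returns the number of keys passed to `delete_one`.
    """
    deleted = 0
    for index, chunk in enumerate(chunked(keys, chunk_size)):
        if index > 0:
            # Pause before every chunk except the first to spread the load.
            sleep(pause_seconds)
        for key in chunk:
            delete_one(key)
            deleted += 1
    return deleted
```

Note that the chunks here are just groups of individual deletes issued one after another with a pause in between; the sketch does not use CQL `BATCH` statements. The `sleep` parameter exists only so the pacing can be swapped out, for example in a dry run.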

Be aware that running manual compactions has its own problems and is only relevant for compaction strategies like SizeTieredCompactionStrategy. I've written about this problem previously in "Why forcing a major compaction is not ideal".

It's also important to note that mass deletions on tables using LeveledCompactionStrategy can result in an IO storm since LCS will aggressively merge partition fragments together into SSTables. Cheers!

2 comments

arunkolluri89_98476 commented

Thanks for the help, @Erick Ramirez. I will stretch out the deletions by deleting the records in small batches.

Erick Ramirez commented

Good luck. Cheers!
