Sun_P asked Erick Ramirez commented

Is it necessary to run repairs before running nodetool garbagecollect?

If we are running repairs on a regular basis (one complete cycle within one gc_grace_seconds), is it still necessary to run a repair before running a major compaction or garbagecollector on the node? The two options are:

1. Run repair, followed by compaction/garbagecollector on all nodes.

2. Run compaction/garbagecollector on all nodes, then run repair.

I feel the first approach is more reliable because it ensures data consistency before compaction/garbagecollector. Is that right?

If yes, do we need to repair again after compaction/garbagecollector?

tombstones

1 Answer

Erick Ramirez answered Erick Ramirez commented

I'm aware that you've been discussing this with Jeremiah "JD" Jordan on ASF Slack so my responses here will be identical to what JD has already told you. For what it's worth, there is no "garbagecollector" in Cassandra so I'm assuming you meant the nodetool garbagecollect command.

As JD explained to you, running repairs before you perform these operations is a safeguard to make sure that all replicas have received all the deletes. If a replica did not receive a delete for whatever reason and you force a major compaction, the tombstone can be purged from the replicas that did receive it while the deleted data still exists on the replica that missed it. The risk is that this data gets resurrected as zombie data because the tombstone that would have masked it no longer exists.
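To make the resurrection risk concrete, here is a toy model in Python. It is only a sketch of the idea, not how Cassandra is implemented: the `Cell` class, `purge_tombstones` and `read` are hypothetical names, a tombstone is modelled as a cell whose value is `None`, and the "newest timestamp wins" merge stands in for what happens when replicas are reconciled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None marks a tombstone (a recorded delete)
    timestamp: int        # write time; the newest timestamp wins on merge


Replica = Dict[str, Cell]


def purge_tombstones(replica: Replica) -> Replica:
    """Mimic a compaction that drops tombstones already past GC grace."""
    return {k: c for k, c in replica.items() if c.value is not None}


def read(key: str, replicas: List[Replica]) -> Optional[str]:
    """Merge the copies of a key from all replicas: newest timestamp wins."""
    cells = [r[key] for r in replicas if key in r]
    if not cells:
        return None
    return max(cells, key=lambda c: c.timestamp).value


# Replicas a and b received the delete (timestamp 2); replica c missed it
# and still holds the original write (timestamp 1).
a = {"k": Cell(None, 2)}
b = {"k": Cell(None, 2)}
c = {"k": Cell("old", 1)}

# While the tombstones exist, they win the merge and the key reads as deleted.
before = read("k", [a, b, c])

# After the tombstones are purged on a and b, only c's stale copy is left.
after = read("k", [purge_tombstones(a), purge_tombstones(b), c])
```

In this model `before` is `None` (deleted) and `after` is `"old"`: nothing is left to say the data was ever deleted. Repairing first would have copied the tombstone to replica c before it was purged anywhere.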

If you are sure that the regular scheduled repair tasks have completed successfully then you don't need to run repairs beforehand. But only an operator can make that decision because it assumes that you've verified that the repairs have completed successfully.
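As a rough sketch of that decision, the Python helper below checks whether the last successfully completed repair cycle started within one GC grace window and builds the nodetool invocations in the matching order. The function names, the `my_keyspace`/`my_table` placeholders and the dates are made up for illustration, and the rule itself is an assumption (a rule of thumb, not an official check). You still have to confirm from your own repair logs that the cycle really succeeded, and the right repair flags depend on your repair strategy.

```python
from datetime import datetime, timedelta
from typing import List

DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default gc_grace_seconds (10 days)


def needs_repair_first(last_cycle_start: datetime, now: datetime,
                       gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """True when the last completed repair cycle started more than one
    GC grace window ago, so a purgeable tombstone may not have been repaired."""
    return now - last_cycle_start > timedelta(seconds=gc_grace_seconds)


def plan(keyspace: str, table: str, repair_first: bool) -> List[List[str]]:
    """Build the nodetool invocations for one node, in the order to run them."""
    steps: List[List[str]] = []
    if repair_first:
        # Repair options (-pr, --full, ...) are deliberately left out here.
        steps.append(["nodetool", "repair", keyspace])
    steps.append(["nodetool", "garbagecollect", keyspace, table])
    return steps


# Hypothetical dates, purely for illustration.
now = datetime(2024, 1, 20)
stale = needs_repair_first(datetime(2024, 1, 1), now)   # cycle began 19 days ago
fresh = needs_repair_first(datetime(2024, 1, 15), now)  # cycle began 5 days ago
commands = plan("my_keyspace", "my_table", repair_first=stale)
```

The helper only returns the command lists; running them (for example with `subprocess.run`) is left to the operator so that nothing executes before the repair status has been verified.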

On your second point, I'm not sure where you got that idea from because running repairs after makes no difference as I've explained above.

I realise you didn't ask this but you mentioned on ASF Slack that you were thinking of reducing gc_grace_seconds to a very low number. Just be aware that it's a dangerous thing to do. The lowest GC grace you can set should be determined by how long it will take you to replace a node in your cluster.

If a node goes down for whatever reason (hardware failure, overload/unresponsive, etc) and stays down until GC grace has elapsed, you can no longer simply restart Cassandra on that node. It won't have received the deletes that happened while it was down, so when it rejoins the cluster, deleted data will get resurrected. You should never set GC grace to a low number unless you have the expertise and understand the implications.
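The two rules of thumb above can be written down as simple checks. This is a sketch only: `can_restart` and `gc_grace_covers_replacement` are hypothetical helper names, and the durations in the example are invented.

```python
from datetime import timedelta


def can_restart(downtime: timedelta, gc_grace_seconds: int) -> bool:
    """A node that was down for less than GC grace can be restarted and
    repaired; past that point it may have missed deletes whose tombstones
    are already purgeable elsewhere, so it should be replaced instead."""
    return downtime < timedelta(seconds=gc_grace_seconds)


def gc_grace_covers_replacement(gc_grace_seconds: int,
                                time_to_replace_node: timedelta) -> bool:
    """GC grace should be at least as long as it takes to replace a node."""
    return timedelta(seconds=gc_grace_seconds) >= time_to_replace_node


# Invented numbers: default GC grace of 10 days versus a very low 1 hour.
ok_to_restart = can_restart(timedelta(days=3), 864000)
too_late = can_restart(timedelta(days=12), 864000)
risky_setting = gc_grace_covers_replacement(3600, timedelta(hours=6))
```

With these numbers a node that was down for 3 days can be restarted, a node that was down for 12 days cannot, and a GC grace of 1 hour does not cover a 6-hour node replacement.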

You should also know that the garbagecollect command has limitations and won't always be able to remove tombstones. And if you haven't seen it already, I wrote this post on Why forcing a major compaction isn't ideal. It can create other problems for you so you really need to focus on fixing your data model since all of these are just band-aid workarounds to a bigger underlying problem in your cluster. Cheers!

2 comments

Sun_P commented

I was clear on JD's point, but afterwards I got a little confused about the relationship between nodetool garbagecollector and repair. It is clear now. Thanks, Erick, for explaining it so clearly.

Erick Ramirez commented (in reply to Sun_P)
Not a problem. Cheers!