Bringing together the Apache Cassandra experts from the community and DataStax.


bharadhwajt asked:

Why is OpsCenter repair running too frequently?

Hi,

I have a cluster that is repairing too frequently (within minutes), because it holds very little data.

It starts a repair cycle every minute.

First I changed the setting below to 100:

time_to_completion_target_percentage = 100

But the time only increased a little; a cycle now completes in 20 minutes. Is there a way to control this so that the repair runs only 4 or 5 times a day?

Later I tried setting min_repair_time = 7200 to see whether the repair would take 2 hours (assuming min_repair_time stretches the total repair time to 7200 seconds, i.e. 2 hours):

max_parallel_repairs = 9

max_pending_repairs = 90

min_repair_time = 7200

single_repair_timeout = 36000

tokenranges_http_timeout = 300

time_to_completion_target_percentage = 0

But the subrange repair didn't even finish, and I see the entries below in the event log:

12/16/2020, 5:11PM UTC Info Launching incremental job with 0 tasks, 0 bytes, 0:00:52 time to complete (Wed Dec 16 17:12:16 UTC 2020) and 'min_repair_time' of 7200 seconds per task, with 0:00:52 time remaining. -
12/16/2020, 5:06PM UTC Info Rolling incremental repair: there is nothing currently to repair, will try again after sleeping for 300 seconds. This period of time is configurable with [repair_service].restart_period. -
12/16/2020, 4:06PM UTC Info Launching incremental job with 0 tasks, 0 bytes, 0:00:52 time to complete (Wed Dec 16 16:07:16 UTC 2020) and 'min_repair_time' of 7200 seconds per task, with 0:00:52 time remaining.


Are there ideal settings to run the repair every N hours?

opscenterrepair

1 Answer

Erick Ramirez answered:

OpsCenter breaks the token ranges up into small subranges so that a full repair of a table completes within gc_grace_seconds (10 days by default). The repair service therefore schedules the subrange repairs so that they are spread across this period, which means there are repairs running all the time.
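To make the pacing idea concrete, here is a rough back-of-the-envelope sketch in Python. It is not OpsCenter's actual scheduling algorithm: the function name and the subrange counts are hypothetical, and it simply assumes the subranges are spread evenly over the default gc_grace_seconds window of 864000 seconds.

```python
from datetime import timedelta

# Cassandra's default gc_grace_seconds: 10 days expressed in seconds.
GC_GRACE_SECONDS = 10 * 24 * 60 * 60  # 864000


def average_gap_between_subrange_repairs(window_seconds: int, subrange_count: int) -> timedelta:
    """Average spacing between subrange repairs if they were spread
    evenly over the window (a simplification for illustration only)."""
    if subrange_count <= 0:
        raise ValueError("subrange_count must be positive")
    return timedelta(seconds=window_seconds / subrange_count)


if __name__ == "__main__":
    # Hypothetical subrange counts, chosen only to show the effect of scale.
    for count in (100, 10_000):
        gap = average_gap_between_subrange_repairs(GC_GRACE_SECONDS, count)
        print(f"{count} subranges -> one repair roughly every {gap}")
```

Under this simplification, the more subranges the service has to get through, the shorter the gap between individual repairs, which is why seeing a repair start every minute or so is not by itself a sign of a problem.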

More importantly, repairs are a normal part of operating Cassandra. You shouldn't try to stop repairs from running; let the repair service do its job. Cheers!
