sunilrpawar7_183464 asked sunilrpawar7_183464 commented

What is the correct way of running a subrange repair?


I am running subrange repair on Cassandra instead of table-wise repair.

The subrange repair is run with the following command:

$ nodetool repair -pr -st XXXXX -et XXXXXX --job-threads 4

Is it the correct way to do it?

Since I do not have much data in Cassandra, I am getting the start and end tokens for the subrange repair as follows:

1. Run the nodetool ring command.

2. Pair up consecutive tokens from the output, in order, to use as start and end.

e.g. if the first token is 1234 and the next is 1239, then the command will be

$ nodetool repair -pr -st 1234 -et 1239 --job-threads 4

3. Repeat the process on all nodes.

Is this the right way to do it?

How can we efficiently get the right start and end token values to run it without errors?

I have read that we can run a CQL query against the 'local' or 'peers' table in the 'system' keyspace to get the token values associated with the node. If that's right, how can we use them as the start and end tokens for a subrange repair?


1 Answer

Erick Ramirez answered sunilrpawar7_183464 commented


First of all, let me enumerate the types of repairs:

  • Full (-full) - compares all SSTables on a node with data on other replicas and makes the necessary repairs
  • Incremental (-inc) - marks SSTables with a repairedAt flag and doesn't repair them again, only new SSTables since the last successful run
  • Parallel - repairs all nodes with the same replica at the same time
  • Sequential (-seq) - repairs one node at a time
  • Partitioner range (-pr) - only repairs the range where the node is the primary owner so the specified range only gets repaired once (recommended)
  • Endpoint range - repairs all partition ranges on a node including ranges owned by other replicas so results in ranges being repaired multiple times (inefficient and not recommended)
  • Subrange repair (-st & -et) - repairs the range between the given start and end tokens

[Source: Manual repair in Cassandra]

The reason I pointed these out is you keep referring to "tablewise" repair which isn't a type of repair and I'm not sure what you mean by it.

Incompatible repairs

More importantly, I'm hoping the above definitions make it clear to you that you are mixing two types of repairs in one command -- partitioner range (-pr) and subrange repair (-st & -et). These two types of repairs are incompatible.

The partitioner range repair picks the token range where the node is the primary owner. When you specify the -st and -et tokens, this range will be in conflict with the range in -pr. If you want to do a subrange repair, just specify the -st and -et tokens.
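To make the distinction concrete, here is a minimal Python sketch of a wrapper that builds the argument list for nodetool repair and refuses to mix the two modes. The helper name build_repair_args and the keyspace name my_keyspace are made up for illustration; only the -pr, -st and -et flags come from the discussion above.

```python
from typing import List, Optional


def build_repair_args(keyspace: str,
                      primary_range: bool = False,
                      start_token: Optional[int] = None,
                      end_token: Optional[int] = None) -> List[str]:
    """Build the argv for one nodetool repair call (hypothetical helper)."""
    has_subrange = start_token is not None or end_token is not None
    # Partitioner range and subrange repairs are different modes: pick one.
    if primary_range and has_subrange:
        raise ValueError("use either -pr or -st/-et, not both")
    if has_subrange and (start_token is None or end_token is None):
        raise ValueError("a subrange repair needs both -st and -et")

    args = ["nodetool", "repair"]
    if primary_range:
        args.append("-pr")
    if has_subrange:
        args += ["-st", str(start_token), "-et", str(end_token)]
    args.append(keyspace)
    return args


# Subrange repair: only the tokens, no -pr.
print(" ".join(build_repair_args("my_keyspace", start_token=1234, end_token=1239)))
# Partitioner range repair: only -pr, no tokens.
print(" ".join(build_repair_args("my_keyspace", primary_range=True)))
```

The sketch only assembles the command; it does not run anything against a cluster.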


Notice in the types of repairs above that only one type is recommended -- partitioner range repair (-pr). The reason for this is that it is the most efficient type of repair and is the most fool-proof. As I've previously stated, the reasons for this are explained in detail in Repairs in Cassandra.

Subrange repairs are not recommended because most users do not know how to pick and calculate the correct ranges. In a lot of cases, they miss repairing certain ranges and wonder why their data is out-of-sync. There are advantages to running subrange repairs but it is best left to experts. If you really want to run it, we recommend that you use a free open-source tool like Cassandra Reaper.
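To give a rough idea of the bookkeeping involved, here is a hedged Python sketch of the range calculation, not how Reaper or any other tool actually does it. The token values, the function names ring_ranges and split_range, and the keyspace name are all made up; the sketch assumes integer tokens and treats each range as start-exclusive, end-inclusive.

```python
from typing import List, Tuple


def ring_ranges(tokens: List[int]) -> List[Tuple[int, int]]:
    """Pair each token with the next one around the ring.

    The last pair wraps from the highest token back to the lowest.
    Leaving it out is an easy way to miss part of the ring.
    """
    ordered = sorted(set(tokens))
    if len(ordered) < 2:
        raise ValueError("need at least two distinct tokens")
    return [(ordered[i], ordered[(i + 1) % len(ordered)])
            for i in range(len(ordered))]


def split_range(start: int, end: int, parts: int) -> List[Tuple[int, int]]:
    """Split a non-wrapping range (start, end] into contiguous pieces."""
    if start >= end:
        raise ValueError("wrap-around ranges are not handled in this sketch")
    if parts < 1:
        raise ValueError("parts must be at least 1")
    width = end - start
    parts = min(parts, width)  # never produce an empty piece
    bounds = [start + width * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


tokens = [1239, 1234, 1250]  # made-up tokens, e.g. copied from nodetool ring
for start, end in ring_ranges(tokens):
    if start < end:
        for sub_start, sub_end in split_range(start, end, 2):
            print(f"nodetool repair -st {sub_start} -et {sub_end} my_keyspace")
    else:
        # The wrap-around range still has to be repaired somehow.
        print(f"# wrap-around range ({start}, {end}] not covered by this sketch")
```

Even in this toy version there are two places to slip up: the wrap-around range and the boundaries between adjacent pieces. That is the kind of detail a dedicated tool takes care of for you.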

As a final point, beware of running 4 repair threads with --job-threads 4. It puts additional strain on the nodes and is unnecessary when you run repairs regularly. Cheers!


sunilrpawar7_183464 commented:

Thanks, @Erick Ramirez, for the detailed explanation.

As per my understanding, we need to run a repair on each node if we specify the -pr option in the command. As you have stated very clearly, -pr and -st/-et are incompatible with each other. So a subrange repair run from any single node should repair all the token ranges specified in the script, on all nodes in the cluster.

If my understanding is correct, would putting all the ranges in one shell script and running it on only one node serve our purpose?

Erick Ramirez commented:

Unfortunately, your understanding is incorrect. If you think about it, your approach is counter-productive. If you are just going to specify all the token ranges then why bother calculating them in the first place? Might as well run the recommended partitioner range repair.

I can't stress it enough -- use the recommended repair -pr. There's a reason why the docs state that subrange repairs are a bad choice since they're for experts. If you really insist on running subrange repairs, don't reinvent the wheel and use the free open-source tools available.

If it's not clear, I'm really trying to help you by saving you from yourself. Cheers!

sunilrpawar7_183464 commented:

Our full repair is failing due to frequent and long GC pauses, so as an alternative we are trying subrange repair.

Our current version is 3.11.2, which needs to be upgraded to at least 3.11.6 for table-wise repair to work (as you suggested in one of the previous questions).
