I have a cluster of 5 nodes. Each node holds about 200 GB of data, and the replication factor is set to 3. Whenever I run a partitioner range repair on a single node (nodetool repair -pr), the performance of the entire cluster drops significantly, so much so that the microservices connected to the cluster receive timeouts on the majority of requests.
Each server has the following specifications:
- 8 vCPUs, 32 GB RAM, 500 GB SSD (CX51 at Hetzner).
- Cassandra version: 3.11.3
- Parallel GC with InitialHeapSize: 515899392 bytes (492 MiB), MaxHeapSize: 8237613056 bytes (7856 MiB)
How do I prevent a repair from overloading the cluster? I have tried lowering the compaction throughput (nodetool setcompactionthroughput) with no apparent difference.
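For reference, the sequence I run on the node being repaired looks roughly like the sketch below. The helper names (`run`, `throttled_repair`) and the value 16 are placeholders for illustration only, not the exact throughput I set. With `DRY_RUN` left at its default of 1, the script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the throttle-then-repair sequence described above.
# The throughput value is a placeholder, not the one used on the cluster.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Lower the compaction throughput cap (MB/s), then start a
# partitioner range repair on the local node.
throttled_repair() {
    throughput_mb="$1"
    run nodetool setcompactionthroughput "$throughput_mb"
    run nodetool repair -pr
}

throttled_repair 16
```

Setting `DRY_RUN=0` would run the two nodetool commands for real on the local node.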