We are running a 3-node Cassandra cluster, and one of the nodes is having trouble keeping up with compaction.
- the table whose compaction is lagging uses LCS (LeveledCompactionStrategy)
- currently, there are more than 1600 compactions pending and the number is decreasing slowly.
- this started after we tuned that node to increase its JVM heap, concurrent_writers, concurrent_compactors and compaction_throughput.
- the other nodes, which were not tuned, are performing fine
- right now, we have set compaction_throughput to 0 to unthrottle the compaction.
Originally, we thought that increasing compaction_throughput would make compaction run faster. However, that is not the case: after tracking disk I/O statistics with iostat and dstat, we realised that neither the CPU nor the disk I/O is saturated.
Output from iostat:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.07    6.01    0.74    0.17    0.00   92.02

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda            1135.75        35.43        28.82  519157965  422340399
sda1           1135.75        35.43        28.82  519157472  422340399
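As a rough sanity check on the numbers above, here is a minimal Python sketch of the arithmetic behind "not saturated". The values are copied from the iostat output. The dictionary keys and the `summarize` helper are names made up for illustration, and treating 1 MB as 1024 KB is an assumption about how iostat reports its units.

```python
# Values copied from the iostat output above (device sda).
cpu = {"user": 1.07, "nice": 6.01, "system": 0.74,
       "iowait": 0.17, "steal": 0.00, "idle": 92.02}
sda = {"tps": 1135.75, "mb_read_s": 35.43, "mb_wrtn_s": 28.82}


def summarize(cpu_pct, dev):
    """Derive a few headline figures from one iostat sample (hypothetical helper)."""
    # Combined read + write throughput on the device.
    total_mb_s = dev["mb_read_s"] + dev["mb_wrtn_s"]
    # Average size of one I/O request, assuming 1 MB = 1024 KB.
    avg_kb_per_transfer = total_mb_s * 1024 / dev["tps"]
    # Everything that is not idle time counts as busy CPU.
    busy_cpu_pct = 100.0 - cpu_pct["idle"]
    return {
        "total_mb_s": total_mb_s,
        "avg_kb_per_transfer": avg_kb_per_transfer,
        "busy_cpu_pct": busy_cpu_pct,
        "iowait_pct": cpu_pct["iowait"],
    }


summary = summarize(cpu, sda)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

This works out to roughly 64 MB/s of combined disk traffic at about 58 KB per request, with about 8% of the CPU busy and almost no time spent waiting on I/O.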
So we are not sure where the bottleneck is. Does anyone have any suggestions on what we could do to resolve this? Thanks!