sunilrpawar7_183464 asked · Erick Ramirez commented

Dropped Mutation, Read, Read Repair in Cassandra cluster

In the application logs, we are observing ReadTimeoutException errors along with a mean cross-node dropped-message latency of 7986 ms. The issue does not persist for long; it lasts only 20-30 seconds at a time.

At the same time, in the Cassandra logs we can observe long GC pauses of about 7-8 seconds, messages about system.batches, and pending and active Native-Transport-Requests (NTR).
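One quick way to quantify those pauses is to scan system.log for GCInspector entries. A minimal sketch, assuming the usual "GC in <n>ms" wording of GCInspector messages (the log path and threshold are illustrative, not part of the original report):

```python
import re

# GCInspector lines in system.log typically contain "... GC in <n>ms."
GC_PAUSE = re.compile(r"GCInspector.*GC in (\d+)ms")

def long_pauses(lines, threshold_ms=5000):
    """Return the pause durations (in ms) that meet or exceed threshold_ms."""
    pauses = [int(m.group(1)) for line in lines if (m := GC_PAUSE.search(line))]
    return [p for p in pauses if p >= threshold_ms]

# Usage (path is an example):
#   with open("/var/log/cassandra/system.log") as f:
#       print(long_pauses(f))
```

Correlating the timestamps of the long pauses with the application-side ReadTimeoutException bursts is usually enough to confirm GC is the trigger.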

We have a 15-node cluster running version 3.11.2.

Heap size: 31 GB, out of 64 GB total RAM.

Swap is disabled and the other production-recommended settings are in place.

1. Is there any way to find out which statements are being executed in the batches?

2. What would be the optimal heap size in such a scenario?

3. In the logs we can see that LOCAL_QUORUM is not satisfied: only 1 (or 0) of 2 required replicas responded to a write request, even though we have RF=3. However, no specific node IP is logged. Is that because of the batches?

4. How can we gather more information and dig deeper into the problem?

Thank you.


1 Answer

Erick Ramirez answered · Erick Ramirez commented

The symptoms you described indicate to me that your cluster is overloaded and you probably need to increase the capacity of your cluster by adding nodes.

The timeouts, read latency and long GC pauses are all linked and are different symptoms of the underlying load problem.

To respond to your questions directly:

  1. You need to have audit logging enabled.
  2. 31GB is the right heap size.
  3. It's happening as a result of nodes being unresponsive from the long GC pauses.
  4. You already know the symptoms of the overload problem.
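On point 1, note that open-source Cassandra 3.11 does not ship audit logging (it arrived in Cassandra 4.0 and is otherwise a DSE feature), so on 3.11.2 a stopgap is probabilistic query tracing. A sketch, assuming a deliberately tiny sample rate (0.001 is an arbitrary example; tracing adds overhead, so keep the probability very low and switch it off afterwards):

```shell
# Trace ~0.1% of requests into the system_traces keyspace
nodetool settraceprobability 0.001

# Inspect what was captured, e.g. from cqlsh:
#   SELECT parameters FROM system_traces.sessions LIMIT 20;

# Disable tracing again when finished
nodetool settraceprobability 0
```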


2 comments

sunilrpawar7_183464 commented:

Thanks, Erick.

Each of the 15 nodes has a data load of about 400 GB, and we only face the problem for those brief periods.

1. Can writing batches that touch partitions across multiple nodes overload the cluster?

2. We are running a range repair at the same time. Does it impact the system, given that it may cause high heap utilization and long GC pauses?

3. We have -XX:MaxGCPauseMillis=500. Would increasing this value to 1000 ms help reduce the pauses? The value recommended by DataStax is between 500 and 2000 ms.
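For reference, those G1 settings live in conf/jvm.options. An illustrative fragment, using the values from the description above (not a tuning recommendation):

```
## G1 settings in conf/jvm.options (illustrative values)
-XX:+UseG1GC
-XX:MaxGCPauseMillis=500

## Heap size from the setup described above
-Xms31G
-Xmx31G

## Note: do not set a fixed young-gen size (-Xmn) with G1;
## G1 sizes the young generation itself to meet the pause target.
```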

Erick Ramirez ♦♦ sunilrpawar7_183464 commented:

When I was talking about load, I meant the nodes being overloaded with requests, not the size of the data on the nodes.

  1. Operations (reads and writes) beyond what the cluster can handle are what makes it overloaded.
  2. Repairs put additional load on the cluster, which is not ideal when it is already overloaded.
  3. The max GC pause is just a target for G1 GC. Increasing the target pause time will not make a cluster less overloaded.
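To see the overload directly rather than infer it, the standard nodetool views are a good starting point. A sketch of what to run on each node during an incident:

```shell
# Thread pool stats: check the "Dropped" column for MUTATION and READ,
# and the pending count for Native-Transport-Requests
nodetool tpstats

# Coordinator-level read/write latency percentiles
nodetool proxyhistograms

# GC statistics accumulated since the last time this command was run
nodetool gcstats
```

Sustained non-zero dropped mutations and climbing pending Native-Transport-Requests during the 20-30 second windows would confirm that the cluster is saturating under the request load.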

You can't tune your way out of an overload. You need to increase the capacity of the cluster by adding more nodes. Cheers!
