Hi,
I'm experimenting with a Cassandra (3.11.4) cluster with two datacenters, each of which has 5 Cassandra nodes. The replication factor is 3 in each DC and NetworkTopologyStrategy is set. Each node has its own rack defined. I'm using the DCAwareRoundRobinPolicy in the application.
When I shut down one or more nodes, I see the CPU load grow on all the other nodes (by about 8-10% per missing node). When I provoke a loss of quorum in one DC, the CPU load on the remaining nodes reaches 50%. I do not see any particular error or hint about what is going on. The debug log shows only connection attempts to the missing servers (connection refused). The "application" load on the DB is non-existent or very low.
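For reference, the arithmetic behind "losing quorum in one DC" can be sketched as follows. This is only a minimal sketch: the function names are hypothetical helpers, not part of Cassandra or any driver, and it assumes the usual rule that a quorum is floor(RF / 2) + 1 replicas, counted per token range rather than per cluster.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def local_quorum_available(replication_factor: int, replicas_down: int) -> bool:
    """True if a token range can still satisfy LOCAL_QUORUM in one DC.

    replicas_down counts only the down nodes that hold a replica of
    that particular range, not every down node in the DC.
    """
    replicas_up = replication_factor - replicas_down
    return replicas_up >= quorum(replication_factor)


# Setup from this post: RF = 3 in each of the two DCs.
rf_per_dc = 3
local_q = quorum(rf_per_dc)        # replicas needed for LOCAL_QUORUM
global_q = quorum(2 * rf_per_dc)   # replicas needed for QUORUM across both DCs
one_down_ok = local_quorum_available(rf_per_dc, 1)
two_down_ok = local_quorum_available(rf_per_dc, 2)
```

With RF = 3, a range tolerates one of its replicas being down, but not two. Since each DC has 5 nodes, shutting down 2 nodes breaks LOCAL_QUORUM only for the ranges that have both of those nodes among their replicas.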
Is this behaviour normal?
What should I do if I expect to lose 2-3 instances in one DC for a few days?
Should I plan to reconfigure the topology after losing nodes, so that the remaining nodes do not keep trying to connect to the missing ones?
Thanks and regards,
Wadim