We have a cluster of 3 DSE Search nodes (DSE Cassandra 6.8.10, RF=3). Hardware: 16 CPU / 128 GB RAM, a 500 GB SSD for the data disk, a 500 GB SSD for the solr.data disk, and a 50 GB SSD for commit logs. Xmx=32 GB.
After some time (2-8 hours), all cluster nodes stop responding. We see a large number of messages like this in debug.log:
ColumnFamilyStore.java:1692 - Flushing of largest memtable, not done, max live ratio 0.32 less than min ratio 0.33
Can someone explain what this message means and where I should look to find the root cause?
Thanks in advance.