automaniac.raj28_50731 asked · Erick Ramirez edited

TPC/all/WRITE_MEMTABLE_FULL active thread spikes leading to hung state of DSE node

Hello Team,

There are 5 nodes currently in a DSE Search cluster.

Below is observed on a particular node


Pool Name                      Active   Pending (w/Backpressure)   Delayed   Completed   Blocked   All time blocked

TPC/all/WRITE_MEMTABLE_FULL      1928                  N/A (N/A)       N/A        1928       N/A                N/A

TPC/all/WRITE_REMOTE                0                 12218 (N/A)      N/A    13289068       N/A                  0

Whenever this pattern appears, i.e. active WRITE_MEMTABLE_FULL threads spike and pending WRITE_REMOTE requests pile up, the node often enters a hung state:


a) The OpsCenter agent stops reporting the node as UP

b) The agent fails to connect to Cassandra on port 9042 and reports this in agent.log

c) cqlsh login fails with a timeout

d) The DSE process keeps running, and nodetool commands still work during this phase

e) Nodetool status does not report the node as down

f) With WRITE_MEMTABLE_FULL active threads spiking, we tried running a manual nodetool flush. The thread count drops but eventually spikes again.

g) Restarting the OpsCenter agent does not help

h) A node restart turns out to be the only solution.
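As a stop-gap while investigating, the manual flush from (f) can be automated. This is only a rough sketch, not a fix: the 1000-thread trigger level is an arbitrary assumption, and it shells out to whatever nodetool binary is on the PATH.

```python
import subprocess

POOL = "TPC/all/WRITE_MEMTABLE_FULL"  # the pool observed spiking above
THRESHOLD = 1000                      # hypothetical trigger level, tune for your node

def active_count(tpstats_output, pool=POOL):
    """Parse the Active column for the given pool from `nodetool tpstats` output."""
    for line in tpstats_output.splitlines():
        fields = line.split()
        if fields and fields[0] == pool:
            return int(fields[1])  # second column is Active
    return 0

def maybe_flush():
    """Run a flush when the pool's active count exceeds the threshold."""
    out = subprocess.run(["nodetool", "tpstats"],
                         capture_output=True, text=True).stdout
    if active_count(out) > THRESHOLD:
        subprocess.run(["nodetool", "flush"], check=True)  # force memtables to disk
```

Run `maybe_flush()` from cron or a watchdog loop; as noted in (f), though, this only buys time because the threads spike again once writes resume.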

Below are the typical error messages seen in the logs when this scenario occurs:

In system.log

ERROR [MessagingService-Incoming-/] 2019-12-09 14:26:30,442 - java.util.concurrent.RejectedExecutionException while receiving WRITES.WRITE from /, caused by: Too many pending remote requests!

In agent.log

ERROR [] 2019-12-17 17:06:19,829 No active cassandra connections to write rollups

The node is a 16 core VM

DSE version: 6.0.9

Mode : Search

Memory: 110GB

HEAP : 31 GB

All memtable settings in cassandra.yaml are left at their defaults (commented out), except memtable_allocation_type:

# memtable_heap_space_in_mb: 2048
# memtable_offheap_space_in_mb: 2048
# memtable_cleanup_threshold: 0.2
memtable_allocation_type: heap_buffers
# commitlog_total_space_in_mb: 8192
# memtable_flush_writers: 4
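For reference, when those lines are commented out Cassandra falls back to computed defaults, roughly: each memtable space is a quarter of the heap, and the cleanup threshold is 1 / (memtable_flush_writers + 1). A quick sketch of what that would mean for this 31 GB heap, assuming the stock Cassandra defaults also apply to DSE 6.0 (worth verifying against the DSE docs for your build):

```python
# Effective memtable defaults when cassandra.yaml leaves them commented out.
# Assumes stock Cassandra behaviour; verify against the DSE 6.0 documentation.

HEAP_MB = 31 * 1024          # 31 GB heap from the question
FLUSH_WRITERS = 2            # common default for a single data directory

memtable_heap_space_in_mb = HEAP_MB // 4       # 1/4 of heap
memtable_offheap_space_in_mb = HEAP_MB // 4    # 1/4 of heap
memtable_cleanup_threshold = 1 / (FLUSH_WRITERS + 1)

print(memtable_heap_space_in_mb)             # 7936
print(round(memtable_cleanup_threshold, 2))  # 0.33
```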

Could you comment on the likely root cause of this, and how should I go about fixing it?


1 Answer

Erick Ramirez answered

@automaniac.raj28_50731 The symptoms you described indicate the node is overloaded and cannot keep up. The RejectedExecutionException is thrown because the disk cannot keep up with the writes. In this situation you will also see dropped mutations, both in the logs and in the output of nodetool tpstats.

The number of active WRITE_MEMTABLE_FULL threads goes up because write requests are stuck waiting for space to free up in the memtables. The overload makes the node appear unresponsive, so cqlsh connections time out and the agent reports the node as down. Restarting DSE appears to "fix" the problem, but all it really does is clear the full request queues so the node starts responding to requests again.

You need to make sure the Solr indexes are on a separate disk from the commit log directory so they are not competing for the same IO bandwidth. If you are bulk loading data, throttle the load to a point where the nodes are not overloaded. The only real solution is to size your cluster correctly and add more nodes to increase capacity. Cheers!
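On the throttling point: if the writes come from your own client code, a simple client-side token bucket keeps the ingest rate below what the nodes can absorb. This is a generic sketch, not DSE-specific; the 5000 writes/sec cap is purely illustrative, and the commented-out driver call is a hypothetical placeholder.

```python
import time

class TokenBucket:
    """Minimal token-bucket throttle: allows roughly `rate` operations per second."""
    def __init__(self, rate):
        self.rate = rate
        self.tokens = rate
        self.last = time.monotonic()

    def acquire(self):
        # Refill tokens based on elapsed time, capped at one second's worth.
        now = time.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            time.sleep((1 - self.tokens) / self.rate)  # wait for the next token
            self.tokens = 1
            self.last = time.monotonic()
        self.tokens -= 1

bucket = TokenBucket(rate=5000)  # illustrative cap, not a recommendation
# for row in rows:
#     bucket.acquire()
#     session.execute_async(insert_stmt, row)  # hypothetical driver call
```

Throttling at the client is a stop-gap like the flush above; as the answer says, the durable fix is sizing the cluster for the workload.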
