rkagan_146583 asked Erick Ramirez edited

How to interpret "defuncting connection" errors?

Hello:


We receive various "defuncting connection" errors and cannot tell which cause lies behind which error: corrupted C* nodes or network errors. How should we classify them?


We looked at the DSE driver implementation and the Solr/Lucene code, but did not find the corresponding error messages.


  1. Row retrieval response timeout of XXX, missing responses from nodes: [x.x.x.x]), defuncting connection
  2. Query response timeout of XXX, missing responses from nodes: [x.x.x.x]), defuncting connection
  3. Connection refused: /x.x.x.x:8609), defuncting connection
  4. Failed handshake due to exhausted XXX seconds timeout on channel [id: XX, L:/XX:58592 - R:/XX:8609].), defuncting connection
  5. Error on shard x.x.x.x: java.lang.AssertionError), defuncting connection


Many, many thanks in advance if you can clarify to us what is happening when we receive such errors.


Kindly,


Roman


search

1 Answer

Erick Ramirez answered

@rkagan_146583 Sorry to hear you're having issues with your cluster. Port 8609 is used for internode messaging, so the symptoms you described indicate that the nodes cannot communicate with each other. Aside from a firewall blocking the traffic, the most common cause is a node (or nodes) so overloaded that it becomes unresponsive. On the problematic node, check for long GC pauses and dropped reads or writes. Cheers!
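As a rough first check on the connectivity side, something like the sketch below can tell you whether the internode messaging port is reachable from one node to another. It is only an illustration, not part of DSE: the function name `probe_port` and the 3-second timeout are assumptions, and a plain TCP connect only shows that the port accepts connections, not that the node is healthy. A "refused" result usually means nothing is listening on the port (or an active reject), while a "timeout" points more towards a firewall silently dropping packets or a node too busy to accept.

```python
import socket
import sys


def probe_port(host: str, port: int = 8609, timeout: float = 3.0) -> str:
    """Try a plain TCP connect and report the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    The default port 8609 is the internode messaging port mentioned above;
    the timeout value is an arbitrary choice for this sketch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Covers unreachable hosts, DNS failures, and similar problems.
        return f"error: {exc}"


if __name__ == "__main__":
    # Usage: python probe_port.py <node-address> [<node-address> ...]
    for node in sys.argv[1:]:
        print(f"{node}:8609 -> {probe_port(node)}")
```

Run it from each node against the addresses listed in the "missing responses from nodes" part of the error. If the port is open everywhere, the overload checks (GC pauses, dropped messages) are the more likely place to look.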
