sharathsai666_167144 asked Erick Ramirez commented

Node didn't join the cluster after bootstrapping

Hi all,
We are having problems scaling out our cluster. We previously had 7 nodes and planned to add 7 more, one node at a time. The first 4 nodes were added successfully. When we added the 5th node, the bootstrap process started, but after 36+ hours the node was still not part of the cluster. There is no connectivity to that node and nodetool commands do not work on it. We checked storage and it is using 1.5 TB out of 3 TB, like the other nodes.

So what steps do we need to follow to recover that node?

Log entries like the following appear in system.log:

2021-08-23 04:48:36:309*[ERROR]*STREAM-OUT-/*o.a.c.s.StreamSession*logError*[Stream #07aaa190-0376-11ec-9159-45ce30811155] Streaming error occurred on session with peer Connection reset by peer
  at Method) ~[na:1.8.0_181]
  at ~[na:1.8.0_181]
  at ~[na:1.8.0_181]
  at ~[na:1.8.0_181]
  at ~[na:1.8.0_181]
  at ~[apache-cassandra-3.11.3.jar:3.11.3]
  at ~[apache-cassandra-3.11.3.jar:3.11.3]
  at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage( [apache-cassandra-3.11.3.jar:3.11.3]
  at org.apache.cassandra.streaming.ConnectionHandler$ [apache-cassandra-3.11.3.jar:3.11.3]
  at [na:1.8.0_181]

1 Answer

Erick Ramirez answered Erick Ramirez commented

The error you posted indicates that the stream got interrupted, which means the bootstrap didn't complete. A node cannot join the cluster until its bootstrap operation completes, which explains what you are seeing.

You need to review the logs, paying particular attention to entries related to the stream ID, which in your case is 07aaa190-0376-11ec-9159-45ce30811155. Cheers!
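As a rough illustration of tracking a stream ID through the logs, here is a minimal Python sketch. It assumes each log entry starts with a date such as `2021-08-23 ` (as in the excerpt posted in the question) and that stack-trace lines follow their entry directly. The function name `stream_entries` and the log path in the usage comment are hypothetical, so adjust them to your logging configuration.

```python
import re

# Assumption: every new log entry begins with a date like "2021-08-23 ",
# matching the excerpt in the question. Other log layouts need another pattern.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def stream_entries(lines, stream_id):
    """Return entries mentioning stream_id, including their stack-trace lines."""
    matched = []
    keep = False
    for line in lines:
        if ENTRY_START.match(line):
            # A new entry starts here: keep it only if it names the stream.
            keep = stream_id in line
        if keep:
            # Continuation lines (stack trace) inherit the decision above.
            matched.append(line.rstrip("\n"))
    return matched


# Hypothetical usage (the path is an assumption, not taken from the thread):
# with open("/var/log/cassandra/system.log", errors="replace") as log:
#     for entry in stream_entries(log, "07aaa190-0376-11ec-9159-45ce30811155"):
#         print(entry)
```

Running the same search on both the joining node and the peer named in the error can help, since each side of the stream logs its own view of the failure.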


sharathsai666_167144 commented

Thanks, @Erick Ramirez, for the reply. The logs above are what I got after tracking that ID.
So, what next steps do I need to follow?

  • Are there any steps to recover that node?
  • Should I delete the node and add a new one?
  • Any other recommendations?
log.txt (801 B)
Erick Ramirez commented

You need to investigate why the stream failed. The text file you posted shows an error reading a partition that was being streamed, which would have caused the bootstrap to fail.

You can attempt to resume the bootstrap with nodetool bootstrap resume, but unless you identify and fix the root cause of the failure, the bootstrap is likely to fail again. Cheers!
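After a resume attempt, one way to see whether the node actually joined is to look at its state code in `nodetool status` output taken from a live node: `UJ` means up and still joining, `UN` means up and normal. The sketch below is a hedged illustration only. The helper name `node_state` and the sample addresses are hypothetical, and it assumes the usual column layout where the state code is the first field and the address is the second.

```python
def node_state(status_output, address):
    """Return the two-letter state code (e.g. 'UN', 'UJ', 'DN') for address.

    Assumes the usual `nodetool status` layout: state code first,
    node address second. Returns None if the address is not listed.
    """
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == address and len(fields[0]) == 2:
            return fields[0]
    return None


# Hypothetical usage, run on a node where nodetool works:
# import subprocess
# out = subprocess.run(["nodetool", "status"], capture_output=True,
#                      text=True, check=True).stdout
# print(node_state(out, "10.0.0.12"))  # the address is a made-up example
```

If the node stays in `UJ` or disappears from the list, the stream most likely failed again and the logs need another look before retrying.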
