vranganathan asked:

Cassandra v3.0.9 bootstrap failing and stuck in JOINING state even after the streaming is complete

We are in the process of scaling the cluster from 30 nodes to about 40.

Configuration for bootstrap:

auto_bootstrap: true (Default)
-Dcassandra.consistent.rangemovement=false (Default)

The bootstrap appears to be complete, with all the nodes having streamed data to the new node. Yet the node stays in the JOINING state and the bootstrap eventually times out (after 3 hours; streaming_socket_timeout_in_ms). This leaves the new node stuck in the UJ state indefinitely. I tried nodetool bootstrap resume, which also hangs indefinitely. I checked nodetool netstats and none of the nodes are streaming data to the new node.


Since I know that this node has all the data that belongs to it, I tried setting auto_bootstrap: false in cassandra.yaml and restarting the Cassandra process. My expectation was that with auto_bootstrap: false the node would not stream data from other nodes, but it seems I am missing something here.


The node still receives data from other nodes and the bootstrap starts all over again.

I went a step further and added -Dcassandra.consistent.rangemovement=false along with auto_bootstrap: false (I did this because I intermittently got a RuntimeException saying "A node required to move the data consistently is down", although all the nodes were UN).

I still see the node trying to stream data from other nodes. Am I missing something here? I would really appreciate it if someone could help me out.

Tags: 3.0.9, apache cassandra, bootstrapping

vranganathan answered:

It looks like `nodetool bootstrap resume` was simply taking a long time to complete and I assumed it was stuck (I cannot explain why I did not see any streaming going through earlier).

All I did was restart the Cassandra process with the defaults (`auto_bootstrap: false` & `-Dcassandra.consistent.rangemovement=false`) and wait until it got stuck again (I was getting a `SocketTimeoutException` consistently).

After it got stuck, I ran `nodetool bootstrap resume` again and let it run for some time. Eventually it completed and the node joined the ring.
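
For anyone who wants to script the "resume and wait" step, here is a rough sketch in Python. It assumes `nodetool` is on the PATH of the joining node and that each node row in `nodetool status` starts with a two-letter code (for example `UJ` or `UN`) followed by the address. The function names, polling interval, and retry limit are made up for illustration and are not part of Cassandra.

```python
import subprocess
import time


def node_states(status_output):
    """Map address -> two-letter state code (e.g. 'UN', 'UJ') from `nodetool status` text."""
    states = {}
    for line in status_output.splitlines():
        parts = line.split()
        # Node rows begin with Up/Down (U/D) plus Normal/Leaving/Joining/Moving (N/L/J/M).
        if (len(parts) >= 2 and len(parts[0]) == 2
                and parts[0][0] in "UD" and parts[0][1] in "NLJM"):
            states[parts[1]] = parts[0]
    return states


def run_nodetool(*args):
    """Run a nodetool subcommand and return its stdout (raises if it exits non-zero)."""
    result = subprocess.run(["nodetool", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout


def resume_until_joined(address, run=run_nodetool, poll_seconds=60,
                        max_polls=180, sleep=time.sleep):
    """Issue `nodetool bootstrap resume` once, then poll `nodetool status`
    until `address` reports UN. Returns False if it never does."""
    run("bootstrap", "resume")  # may take a long time to return, as described above
    for _ in range(max_polls):
        if node_states(run("status")).get(address) == "UN":
            return True
        sleep(poll_seconds)
    return False
```

Calling `resume_until_joined("<address of the new node>")` on the joining node would then return only once the node shows as UN or the polling limit is reached. The point is to give the resume enough time instead of assuming it is stuck.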

Erick Ramirez answered:

@vranganathan Since you have disabled consistent range movement, the likely scenario is that the bootstrap streams for the node were interrupted when you added nodes simultaneously, so the streams never actually completed. When you ran the resume command, it restarted those streams. Cheers!
