rohithsolomon asked ·

Fatal schema migration error, Can't bootstrap (all migration tasks failed or timed out)

Hi Team:

I'm hitting the following error while trying to add a new node to an existing cluster. I'm following the instructions from DS210 on nodes I set up in VMs.

New node config (almost the same as the existing node's, with the seed on the new node pointing to the existing one):

theowl@c-node1:~$ hostname -i
192.168.56.103
theowl@c-node1:~$ hostname
c-node1
theowl@c-node1:~$ sudo vi /etc/dse/cassandra/cassandra.yaml
[sudo] password for theowl:
theowl@c-node1:~$ cd /etc/dse/cassandra/
theowl@c-node1:/etc/dse/cassandra$ egrep 'cluster_name:|listen_address:|native_transport_address:|seeds:|num_tokens:|initial_token:|endpoint_snitch:' cassandra.yaml
cluster_name: 'KillrVideoCluster'
num_tokens: 8
# initial_token:
          - seeds: "192.168.56.101"
listen_address: 192.168.56.103
native_transport_address: 192.168.56.103
endpoint_snitch: GossipingPropertyFileSnitch

Error Stack:

INFO  [main] 2020-09-10 21:09:31,805  YamlConfigurationLoader.java:77 - Configuration location: file:/etc/dse/cassandra/cassandra.yaml
INFO  [main] 2020-09-10 21:09:31,821  YamlConfigurationLoader.java:77 - Configuration location: file:/etc/dse/cassandra/cassandra.yaml
INFO  [main] 2020-09-10 21:09:31,897  StorageService.java:809 - Loading persisted ring state
INFO  [main] 2020-09-10 21:09:31,904  StorageService.java:938 - Starting up server gossip
INFO  [main] 2020-09-10 21:09:31,926  YamlConfigurationLoader.java:77 - Configuration location: file:/etc/dse/cassandra/cassandra.yaml
INFO  [main] 2020-09-10 21:09:31,939  YamlConfigurationLoader.java:77 - Configuration location: file:/etc/dse/cassandra/cassandra.yaml
INFO  [main] 2020-09-10 21:09:32,057  Gossiper.java:2181 - Waiting for gossip to settle before accepting client requests...
INFO  [GossipStage:1] 2020-09-10 21:09:32,947  Gossiper.java:1317 - Node /192.168.56.101 has restarted, now UP
INFO  [GossipStage:1] 2020-09-10 21:09:32,958  Gossiper.java:1281 - InetAddress /192.168.56.101 is now UP
INFO  [GossipStage:1] 2020-09-10 21:09:32,968  Gossiper.java:1341 - WRITING LOCAL JOIN INFO to [com.datastax.bdp.util.Addresses$Internode$AddressCacheManager@2787be1d, org.apache.cassandra.gms.Gossiper$2@51356f67, org.apache.cassandra.service.StorageService@2a4397df, org.apache.cassandra.locator.ReconnectableSnitchHelper@7918506f, org.apache.cassandra.service.LoadBroadcaster@63756e07]
INFO  [GossipStage:1] 2020-09-10 21:09:32,973  StorageService.java:2931 - Node /192.168.56.101 state jump to NORMAL
INFO  [GossipStage:1] 2020-09-10 21:09:33,184  TokenMetadata.java:519 - Updating topology for /192.168.56.101
INFO  [GossipStage:1] 2020-09-10 21:09:33,192  TokenMetadata.java:519 - Updating topology for /192.168.56.101
WARN  [GossipTasks:1] 2020-09-10 21:09:33,937  FailureDetector.java:290 - Not marking nodes down due to local pause of 22617744993 > 5000000000
INFO  [main] 2020-09-10 21:09:52,164  Gossiper.java:2250 - No gossip backlog
INFO  [main] 2020-09-10 21:09:52,164  Gossiper.java:2301 - No pending echos; proceeding.  Echos failed 0, Echos succeeded 1
INFO  [main] 2020-09-10 21:09:52,165  Gossiper.java:2311 - Gossip settled; proceeding
INFO  [main] 2020-09-10 21:09:52,196  YamlConfigurationLoader.java:77 - Configuration location: file:/etc/dse/cassandra/cassandra.yaml
INFO  [main] 2020-09-10 21:09:52,209  YamlConfigurationLoader.java:77 - Configuration location: file:/etc/dse/cassandra/cassandra.yaml
INFO  [main] 2020-09-10 21:09:52,218  YamlConfigurationLoader.java:77 - Configuration location: file:/etc/dse/cassandra/cassandra.yaml
WARN  [main] 2020-09-10 21:09:52,219  StorageService.java:1020 - Detected previous bootstrap failure; retrying
INFO  [main] 2020-09-10 21:09:52,219  StorageService.java:1856 - JOINING: waiting for ring information
INFO  [main] 2020-09-10 21:09:52,220  StorageService.java:1856 - JOINING: waiting for schema information to complete
INFO  [main] 2020-09-10 21:09:52,220  MigrationManager.java:147 - Waiting for all in flight PULL schema request to finish
ERROR [main] 2020-09-10 21:09:52,227  CassandraDaemon.java:853 - Fatal schema migration error
org.apache.cassandra.exceptions.MigrationException: Can't bootstrap (all migration tasks failed or timed out)
        at org.apache.cassandra.schema.MigrationManager.waitUntilReadyForBootstrap(MigrationManager.java:150)
        at org.apache.cassandra.service.StorageService.waitForSchema(StorageService.java:978)
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1024)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:695)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:619)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:402)
        at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:527)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:693)
        at com.datastax.bdp.DseModule.main(DseModule.java:96)
INFO  [StorageServiceShutdownHook] 2020-09-10 21:10:02,244  DseDaemon.java:855 - DSE shutting down...

Initially, it worked fine: I ran nodetool status and could see the existing node's address in the output. After a while, the newly added node shuts down with that error.

Please help me understand what exactly the bootstrapping issue is. Any pointers to documented fixes are fine as well.

[EDIT] This node was started after installing DSE with Test Cluster config. I had to perform the following steps to avoid the error:

org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name Test Cluster != configured name KillrVideoCluster


1 Answer

Erick Ramirez answered ·

You didn't indicate which version of DSE you installed so it's a bit difficult to troubleshoot what's going on.

But the reason the bootstrap failed is the node didn't get the schema from the other node (for whatever reason). You will need to review the debug.log for clues.
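As a rough sketch of what reviewing the log could look like, the helper below pulls out only the schema and migration lines. The function name `schema_log_lines` is made up for illustration, and the log location `/var/log/cassandra/debug.log` is an assumption based on a default package install; adjust it if your logs live elsewhere.

```shell
# Hypothetical helper: print only the lines of a Cassandra/DSE log
# that mention schema or migration activity (case-insensitive).
schema_log_lines() {
    grep -iE 'migration|schema' "$1"
}
```

For example, `schema_log_lines /var/log/cassandra/debug.log | tail -n 50` would show the most recent schema-related entries leading up to the failure.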

You will also need to make sure that the OS user cassandra has full permissions to the data and commitlog directories. Also make sure that none of the files and subdirectories are owned by root. Cheers!
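One way to run that ownership check, as a minimal sketch: list everything under a directory that is not owned by a given user. The function name `not_owned_by` is hypothetical, and `/var/lib/cassandra` is assumed to be the parent of the data and commitlog directories.

```shell
# Hypothetical helper: list every file and directory under $1
# that is NOT owned by user $2 (user name or numeric uid).
not_owned_by() {
    find "$1" ! -user "$2"
}
```

Running `not_owned_by /var/lib/cassandra cassandra` should print nothing on a healthy node. If it lists entries, the usual remedy is `sudo chown -R cassandra:cassandra /var/lib/cassandra` while DSE is stopped.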

[UPDATE] The original issue you reported is unrelated to the real issue -- that you re-provisioned the node without wiping all its data and configuration.

It isn't possible to change the cluster name on a node once it has been added to a cluster (even for single-node clusters). That's what was preventing the node from joining another cluster -- you're effectively changing the node's cluster name by specifying a different one in the cassandra.yaml.
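To see which name a node is configured with before starting it, a small sketch like the following reads cluster_name out of cassandra.yaml. The function name `configured_cluster_name` is made up for illustration; it assumes the setting sits on a single unindented line, as in the egrep output from the question.

```shell
# Hypothetical helper: print the cluster_name value from the given
# cassandra.yaml, with surrounding single or double quotes stripped.
configured_cluster_name() {
    sed -n "s/^cluster_name:[[:space:]]*//p" "$1" \
        | sed "s/^['\"]//; s/['\"][[:space:]]*\$//"
}
```

For example, `configured_cluster_name /etc/dse/cassandra/cassandra.yaml` can be run on both nodes to confirm the two values match. It only reads the configured name; the name saved in the node's system tables is what the startup check compares against.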

To re-purpose a node, you need to delete all the contents of the directories including:

  • data/
  • commitlog/
  • saved_caches/
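The cleanup above could be sketched as a small function. The name `wipe_node_state` is hypothetical, and it assumes all three directories sit under one base directory (`/var/lib/cassandra` on the node in this thread; `saved_caches/` being there too is an assumption, so check the directory settings in cassandra.yaml first). It removes the contents but keeps the directories themselves, so their ownership is preserved.

```shell
# Hypothetical helper: empty data/, commitlog/ and saved_caches/
# under the base directory given as $1, keeping the directories.
wipe_node_state() {
    base="$1"
    for sub in data commitlog saved_caches; do
        dir="$base/$sub"
        # Skip directories that do not exist on this node.
        if [ -d "$dir" ]; then
            # -mindepth 1 spares the directory itself; -delete removes
            # everything below it, deepest entries first.
            find "$dir" -mindepth 1 -delete
        fi
    done
}
```

Stop DSE first (`sudo service dse stop`), run `wipe_node_state /var/lib/cassandra` as root, then bring the node back with `sudo service dse start`. This is destructive, so only run it on a node whose data you intend to discard.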

Hey Erick:

I'm using DSE 6.7.8 (per the DS210 instructions). The data and commitlog directories are owned by cassandra. The only error I see in the debug log is the one I mentioned earlier.

theowl@c-node1:~$ sudo ls -ltr /var/lib/cassandra/data
total 56
drwxr-xr-x 13 cassandra cassandra 4096 Sep 10 19:27 system_schema
....
....
drwxr-xr-x 18 cassandra cassandra 4096 Sep 10 20:08 system
drwxr-xr-x  3 cassandra cassandra 4096 Sep 10 20:08 killr_video ---> the one I created during the stress test on my initial node

theowl@c-node1:~$ sudo ls -ltr /var/lib/cassandra/commitlog
total 4
-rw-r--r-- 1 cassandra cassandra 28 Sep 11 18:20 CommitLog-600-1599848414710.log

This node was started after installing DSE with Test Cluster config. I had to perform the following steps to avoid the error:

org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name Test Cluster != configured name KillrVideoCluster

sudo service dse stop
sudo rm -rf /var/lib/cassandra/data/system/*
sudo vi /etc/cassandra/cassandra.yaml   # set the proper parameters
sudo service dse start
nodetool status

I was unable to find the cause of the error, so I removed the files:

sudo sh -c 'rm -rf /var/lib/cassandra/*' 

and brought the node back online. Since no IP changes were made, it joined the cluster.
