Bringing together the Apache Cassandra experts from the community and DataStax.

pi_165798 asked ·

Cluster suddenly dropped a keyspace

Hello,

We are hosting a Cassandra cluster with 5 nodes and around 2 TB of data. Cassandra version 3.11.6. We run weekly repairs scheduled by Cassandra Reaper. This morning the main keyspace was suddenly dropped, and all data was deleted. We have no idea why this happened. Looking through the logs, we found this entry:

INFO  [Native-Transport-Requests-13] 2021-03-01 09:46:42,275 MigrationManager.java:495 - Drop Keyspace 'helios'

There were also some errors related to insufficient disk space:

WARN  [CompactionExecutor:83038] 2021-03-01 09:21:35,070 CompactionTask.java:356 - Not enough space for compaction, 101305.734MB estimated.  Reducing scope.

ERROR [CompactionExecutor:83038] 2021-03-01 09:21:35,118 CassandraDaemon.java:235 - Exception in thread Thread[CompactionExecutor:83038,1,main]
java.lang.RuntimeException: Not enough space to write 57.776GiB to /var/lib/cassandra/data (47.092GiB available)

Could these errors have caused the keyspace to be dropped? That seems unlikely.

Thankfully, we managed to recover the data thanks to the auto_snapshot parameter. But we are still very worried about how this could have happened. We have since IP-restricted all incoming traffic to the nodes.


1 Answer

Erick Ramirez answered ·

There is no operation or feature in Cassandra that drops a keyspace on its own. Doing so would be catastrophic for any database.

If you look closely at the log entry, the thread that reported the keyspace getting dropped was Native-Transport-Requests:

INFO  [Native-Transport-Requests-13] 2021-03-01 09:46:42,275 MigrationManager.java:495 - Drop Keyspace 'helios'

If you're not already aware, native transport in Cassandra is the native binary protocol that CQL clients use to talk to the cluster. This means that the keyspace drop was initiated by a CQL client -- cqlsh, a CQL tool such as DevCenter or DataStax Studio, an application, etc.
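To help narrow down when such requests arrived, here is a minimal Python sketch that scans log lines for schema drops reported by a Native-Transport-Requests thread. It is only an illustration: the regex assumes the line layout of the entries quoted in this thread, the "Drop " message prefix is inferred from the single entry above, and the function name find_client_drops is hypothetical.

```python
import re

# Assumption: lines follow the layout of the entries quoted in this thread:
# LEVEL  [thread] yyyy-mm-dd HH:MM:SS,mmm Source.java:NNN - message
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+)\s+-\s+(?P<message>.*)$"
)


def find_client_drops(lines):
    """Return (timestamp, thread, message) for drops logged by client request threads."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # stack trace lines and other non-matching lines are skipped
        from_client = match["thread"].startswith("Native-Transport-Requests")
        # Assumption: drop messages start with "Drop ", as in the entry above.
        if from_client and match["message"].startswith("Drop "):
            hits.append((match["timestamp"], match["thread"], match["message"]))
    return hits


# The two sample lines are copied from the question.
sample = [
    "INFO  [Native-Transport-Requests-13] 2021-03-01 09:46:42,275 "
    "MigrationManager.java:495 - Drop Keyspace 'helios'",
    "WARN  [CompactionExecutor:83038] 2021-03-01 09:21:35,070 "
    "CompactionTask.java:356 - Not enough space for compaction, "
    "101305.734MB estimated.  Reducing scope.",
]

for hit in find_client_drops(sample):
    print(hit)
```

Running this over the system log of each node gives timestamps that can be compared against application deployments, cron jobs and operator sessions from around the same time.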

Interestingly, I responded to an identical question on the Cassandra user mailing list, so I'll provide the same answer here.

Since it came in as a CQL request, the keyspace didn't get randomly dropped -- some operator, developer, daemon, orchestration tool or similar initiated it, either intentionally or by accident.

I've seen this happen a number of times where a developer thought they were connected to a dev/staging/test environment and issued a DROP or TRUNCATE, not realising they were connected to production. I'm not saying this is what happened in your case, but it should give you some ideas on where to focus your investigation. Cheers!
