victor_188679 asked victor_188679 commented

How do you manage schema changes in Cassandra?

Are there any good resources on managing schema changes in Cassandra? In our relational database, we have added and removed columns and indexes a number of times over the development cycles. Sometimes we need to migrate data, but overall the effort is minimal since the data is normalized. In Cassandra, however, the same data may be duplicated in multiple places, so it seems difficult to change schemas once they are defined.

UPDATE: I don't have any specific cases since I'm still evaluating it. In general, it seems that how the data model is defined influences how the data is stored in Cassandra, so some schema changes would require reinserting the same data, for example when a new column is added to the primary key.

cassandra

Erick Ramirez answered Erick Ramirez rolled back

@victor_188679 Your question is a little too open-ended and general, which makes it difficult to answer, but I'll try my best.

Adding and removing columns in a table isn't really an issue. If you have denormalised views of the table (i.e. tables modelled differently but containing the same duplicated data), then yes, you will need to alter those tables accordingly. The same applies to indexes -- you can add new indexes or remove indexes for columns you no longer need.
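To make that concrete, here is a minimal sketch in Python that builds the matching `ALTER TABLE` statements for every table holding the same duplicated data. The keyspace `shop`, the two table names, and the helper `alter_statements` are hypothetical and only serve as an illustration; the generated CQL still has to be run through cqlsh or a driver.

```python
def alter_statements(keyspace, tables, column, cql_type=None):
    """Build one ALTER TABLE statement per table that holds the duplicated data.

    With a cql_type the column is added; without one it is dropped.
    """
    action = f"ADD {column} {cql_type}" if cql_type else f"DROP {column}"
    return [f"ALTER TABLE {keyspace}.{table} {action}" for table in tables]


# Hypothetical tables: the same user data modelled for two different queries.
tables = ["users_by_id", "users_by_email"]

add_phone = alter_statements("shop", tables, "phone", "text")
drop_fax = alter_statements("shop", tables, "fax")

for statement in add_phone + drop_fax:
    print(statement)
```

Note that this only applies to regular columns; columns that are part of the primary key cannot be added or dropped this way.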

UPDATE: You cannot change the primary key of a table once it's been created. The partition key determines how data is distributed (partitioned) across the nodes in the cluster, and the rest of the primary key (the clustering columns) determines how rows are stored within each partition, so neither can be changed.

If you need to change the primary key, you effectively have a completely new table, so you will have to create a brand new one. Whenever you have a new application query, you will need to design a new table for that query. Cheers!
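As a rough illustration of what "create a brand new one" can look like, the sketch below creates a table with a different primary key and reinserts the existing rows into it. It assumes a session object with an `execute(query, params)` method and `%s` placeholders, in the style of the DataStax Python driver. The keyspace, tables, and columns are hypothetical, and `FakeSession` is an in-memory stand-in so the example runs without a cluster.

```python
from collections import namedtuple

# Hypothetical new table: same order data, but partitioned by customer.
CREATE_NEW_TABLE = """
CREATE TABLE IF NOT EXISTS shop.orders_by_customer (
    customer_id uuid,
    order_id uuid,
    total decimal,
    PRIMARY KEY ((customer_id), order_id)
)
"""
SELECT_OLD = "SELECT order_id, customer_id, total FROM shop.orders_by_id"
INSERT_NEW = (
    "INSERT INTO shop.orders_by_customer (customer_id, order_id, total) "
    "VALUES (%s, %s, %s)"
)


def copy_to_new_table(session):
    """Create the re-keyed table, then reinsert every row from the old one."""
    session.execute(CREATE_NEW_TABLE)
    copied = 0
    for row in session.execute(SELECT_OLD):
        session.execute(INSERT_NEW, (row.customer_id, row.order_id, row.total))
        copied += 1
    return copied


# In-memory stand-in for a driver session, only so the sketch runs as-is.
Order = namedtuple("Order", ["order_id", "customer_id", "total"])


class FakeSession:
    def __init__(self, old_rows):
        self.old_rows = old_rows
        self.inserted = []

    def execute(self, query, params=None):
        if query == SELECT_OLD:
            return list(self.old_rows)
        if query == INSERT_NEW:
            self.inserted.append(params)
        return []


session = FakeSession([Order("o1", "c1", 10), Order("o2", "c1", 25)])
copied = copy_to_new_table(session)
print(copied, session.inserted)
```

A single-client full-table read like this is only reasonable for small tables; for large ones a bulk-loading tool or a Spark job is the more usual choice.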

Gangadhara M.B answered victor_188679 commented

We had a cluster with 15 nodes. When DDL changes were deployed through one node, we would sometimes get errors in the log files of other nodes.

If another node tried to access the newly added schema objects just before the DDL changes had propagated to all nodes, we would get errors.

Our solution was to restart the Cassandra service on the node where the error appeared; after the restart, the error disappeared.
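One way to reduce the window described above is to wait until every node reports the same schema version before using the new schema objects. The sketch below polls the `schema_version` column of `system.local` and `system.peers`, the same information that `nodetool describecluster` summarises. It assumes a driver-style session with an `execute(query)` method; the function names and `FakeSession` are hypothetical, with the latter only there so the example runs without a cluster.

```python
import time
from collections import namedtuple


def schema_versions(session):
    """Collect the distinct schema versions reported by this node and its peers."""
    versions = set()
    for table in ("system.local", "system.peers"):
        for row in session.execute(f"SELECT schema_version FROM {table}"):
            if row.schema_version is not None:
                versions.add(row.schema_version)
    return versions


def wait_for_schema_agreement(session, timeout=30.0, interval=1.0, sleep=time.sleep):
    """Poll until all nodes report one schema version; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if len(schema_versions(session)) == 1:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# In-memory stand-in for a driver session, only so the sketch runs as-is.
Row = namedtuple("Row", ["schema_version"])


class FakeSession:
    def __init__(self, peer_versions_per_poll):
        self.polls = iter(peer_versions_per_poll)

    def execute(self, query):
        if "system.local" in query:
            return [Row("v2")]
        return [Row(version) for version in next(self.polls)]


# First poll: one peer is still on the old schema; second poll: all agree.
session = FakeSession([["v1", "v2"], ["v2", "v2"]])
agreed = wait_for_schema_agreement(session, sleep=lambda seconds: None)
print(agreed)
```

The DataStax Python driver already waits for schema agreement after a DDL statement, up to its `max_schema_agreement_wait` setting, so an explicit check like this is mainly useful when DDL is applied by a separate tool. A node that is down can keep a stale entry in `system.peers`, so a timeout here does not by itself mean the schema failed to propagate.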

victor_188679 commented

Thank you for sharing!

Ryan Quey answered victor_188679 commented

This isn't really an answer, just some resources. I recently had a similar question and put together a list of resources for personal reference. Here's what I found:

Articles

Libs


Take a look at Cassandra.link under tools for more. It has other tools related to schema management, among other things. Hope this helps!

victor_188679 commented

Thank you, I will look into it.
