question

ary asked ·

Does the cass-operator support multi-DC clusters based on 2 k8s clusters?

Is there any way to create a multi-datacenter cluster (using the NetworkTopologyStrategy replication strategy) across different Kubernetes clusters with cass-operator v1.3? In other words, we have 2 k8s clusters in different datacenters (dc1 and dc2) and we want to create a single Cassandra cluster spanning these two datacenters.

CREATE KEYSPACE test WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 3};

3 Cassandra pods in dc1 and 3 Cassandra pods in dc2

Thanks,
Ary

cass-operator

1 Answer

Erick Ramirez answered ·

[EDITED] The DataStax Cassandra Operator (cass-operator) has support for multi-datacenter clusters. You just need to create two CassandraDatacenter resources with the same clusterName in the spec section. But multi-region clusters are not supported in Kubernetes (at the time of writing). Allow me to explain it in detail.
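As a rough illustration of "two CassandraDatacenter resources with the same clusterName", here is a minimal sketch that builds both manifests as plain dictionaries and prints them as JSON (which kubectl also accepts). The helper name `cassandra_datacenter`, the cluster name `cluster1`, the server version and the size are assumptions chosen for illustration. The field names are as I recall them from the v1beta1 CRD, so check them against the operator documentation. A real manifest also needs a storage configuration, which is left out here.

```python
import json


def cassandra_datacenter(dc_name, cluster_name, size=3):
    """Build a minimal CassandraDatacenter manifest as a plain dict.

    Hypothetical helper for illustration only. storageConfig and other
    settings a real deployment needs are deliberately omitted.
    """
    return {
        "apiVersion": "cassandra.datastax.com/v1beta1",
        "kind": "CassandraDatacenter",
        "metadata": {"name": dc_name},
        "spec": {
            # The shared clusterName is what ties both DCs into one cluster.
            "clusterName": cluster_name,
            "serverType": "cassandra",
            "serverVersion": "3.11.6",  # assumed version, adjust as needed
            "size": size,
        },
    }


# Two datacenters, one logical Cassandra cluster (names are assumptions).
dc1 = cassandra_datacenter("dc1", "cluster1")
dc2 = cassandra_datacenter("dc2", "cluster1")

if __name__ == "__main__":
    for manifest in (dc1, dc2):
        print(json.dumps(manifest, indent=2))
```

The only thing the two resources must have in common is `spec.clusterName`. The `metadata.name` values differ because each one names a separate datacenter.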

Multiple Cassandra DCs

At the pure-Cassandra layer (setting aside Kubernetes for a minute), there are three minimum criteria required for C* nodes to be able to form a C* cluster:

  • They must have the same C* cluster_name.
  • They must have at least one common seed in the seeds list.
  • They must be able to gossip with each other (shared network segment).

The same rules apply to two datacenters: same cluster name, a common seed and network connectivity between the DCs.
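The three criteria can be written down as a small check. This is only a sketch of the reasoning, not how Cassandra itself decides membership: `NodeConfig`, `can_form_cluster` and the `reachable` predicate are hypothetical names, the IP addresses are made up, and requiring every pair of nodes to reach each other is a simplification of "able to gossip".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical view of the settings that matter for cluster formation."""
    name: str
    cluster_name: str
    seeds: frozenset  # seed addresses from the node's seeds list


def can_form_cluster(nodes, reachable):
    """Return True when all nodes satisfy the three criteria.

    reachable(a, b) is a caller-supplied predicate that says whether
    node a can open a connection to node b.
    """
    if not nodes:
        return False
    # 1. Same cluster name everywhere.
    same_name = len({n.cluster_name for n in nodes}) == 1
    # 2. At least one seed that appears in every node's seeds list.
    common_seeds = frozenset.intersection(*(n.seeds for n in nodes))
    # 3. Network connectivity (simplified to full pairwise reachability).
    connected = all(reachable(a, b) for a in nodes for b in nodes if a is not b)
    return same_name and bool(common_seeds) and connected


# Made-up example: one node per DC, sharing the seed 10.0.1.10.
dc1_node = NodeConfig("dc1-node1", "cluster1", frozenset({"10.0.1.10", "10.0.2.10"}))
dc2_node = NodeConfig("dc2-node1", "cluster1", frozenset({"10.0.1.10"}))
stranger = NodeConfig("other-node", "cluster2", frozenset({"10.0.1.10"}))


def always_reachable(a, b):
    return True
```

With these values, `can_form_cluster([dc1_node, dc2_node], always_reachable)` is true, while swapping in `stranger` fails the cluster-name check.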

Containerised DCs

The same criteria apply to Cassandra clusters deployed in a Kubernetes cluster. It does not matter that the Cassandra nodes are in Docker containers running in Kubernetes pods.

As long as they share the same cluster name, have a common seed and have network connectivity, the Cassandra DCs can be part of the same cluster.

The cass-operator supports multi-DC in a single Kubernetes cluster regardless of whether:

  • Kubernetes worker nodes which host C* pods are in the same zone.
  • Kubernetes worker nodes are in different zones in the same region.

In this diagram, Cassandra DC A and DC B could be in the same or different zones as long as they are in the same region:

The caveat for this configuration is that you need to ensure the control plane is configured for high availability. Otherwise, the control plane is a single point of failure if it is only located in one zone.

Multi-region deployments

In a configuration where Cassandra DCs are deployed in multiple regions, Cassandra will operate as long as the nodes share the same cluster name, have a common seed and have network connectivity (GKE VPC in your case).

But since each DC is managed by a separate operator in each region:

  • Operator 1 can only manage the DC in region 1.
  • Operator 2 can only manage the DC in region 2.

In simpler terms, the operators have nothing to do with each other: coordinating them would require federation, which is not supported in Kubernetes. To be clear, this isn't a limitation of the cass-operator but functionality that Kubernetes does not support.

If you make a configuration change in region 1, that change will only be applied to that region by operator 1. Similarly, changes in region 2 will only be applied in that region by operator 2, because Kubernetes does not support federated configurations across regions.
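The scoping described here can be pictured with a toy model. This is not cass-operator code: the `Operator` class, the `region-1`/`region-2` labels and the `num_tokens` setting are all hypothetical, chosen only to show that an operator touches nothing outside its own Kubernetes cluster.

```python
class Operator:
    """Toy stand-in for an operator that only sees its own Kubernetes cluster."""

    def __init__(self, k8s_cluster):
        self.k8s_cluster = k8s_cluster

    def apply(self, datacenters, setting, value):
        """Apply a setting to the DCs this operator manages.

        Returns the names of the datacenters that were changed.
        """
        changed = []
        for dc in datacenters:
            # DCs hosted in another Kubernetes cluster are invisible here.
            if dc["k8s_cluster"] == self.k8s_cluster:
                dc["config"][setting] = value
                changed.append(dc["name"])
        return changed


datacenters = [
    {"name": "dc1", "k8s_cluster": "region-1", "config": {}},
    {"name": "dc2", "k8s_cluster": "region-2", "config": {}},
]

operator_1 = Operator("region-1")
changed = operator_1.apply(datacenters, "num_tokens", 16)
```

After this call only dc1 carries the new setting. To bring dc2 in line, the same change has to be made separately through the operator in region 2.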

For more information, see the official Kubernetes Best Practices document on Running in multiple zones. Cheers!



@Erick Ramirez So Cassandra is limited to our Kubernetes cluster boundaries? I think this is a cass-operator limitation. What would be the point of defining a Cassandra multi-datacenter cluster (for example dc1 and dc2) if its nodes are going to live in the same k8s cluster (same data center)? So is it not possible to have a Cassandra cluster spread across two regions using k8s (for example, GKE Central and GKE East)?


No, this is a limitation with Kubernetes and not the cass-operator. Let me update my answer. Cheers!

ary, in reply to Erick Ramirez:

It doesn't make much sense to me. Maybe that depends on the k8s cluster setup.
Best,
ary


Thanks @Erick Ramirez for such a great explanation. It's clearer to me now.


Good to hear. I'm glad we got there in the end. Cheers!
