rajib76 asked Erick Ramirez commented

How does a newly added node in Astra DB get a copy of the data in the cluster?

When a new node is added to Astra DB (which I understand is transparent to users), how does it know which data to load? Also, does it connect to seed nodes and maintain a cluster map?

astra db

Patcho2005 answered Erick Ramirez commented

Hi, the Astra DB architecture is a little different from the traditional Cassandra (C*) architecture. Astra DB separates the compute and storage of the nodes so that the architecture is truly serverless. There is a whitepaper that helps to explain this.

I hope this helps.


You're right! Thanks for being a part of the community. Your contribution is very much appreciated. Cheers!
Erick Ramirez answered Erick Ramirez commented

Astra DB is a serverless, cloud-native Cassandra database-as-a-service (DBaaS), meaning that there are no servers (nodes) involved in the traditional sense that most users are accustomed to.

The compute functions that traditional Cassandra nodes provide (including query coordination, reading and writing data, compaction, repair, etc.) are decoupled from the data.

Astra DB runs on Kubernetes and the compute functions run as services to operate on a separate storage layer. Here is a diagram that illustrates the Astra DB architecture (source: Astra DB Serverless Whitepaper):

[Figure: Astra DB microservices architecture]

In a serverless architecture, the compute instances scale independently from the storage layer. To answer your question directly, there are no Cassandra "nodes" in Astra DB so we don't add new nodes. Instead, we scale the Coordination Services and/or Data Services to match the throughput as required.
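To make the decoupling a bit more concrete, here is a toy Python sketch. It is only an illustration of the idea, not how Astra DB is implemented: `SharedStorage`, `ComputeService` and `scale_out` are hypothetical names, and the storage layer is just an in-memory dict. The point it shows is that adding compute workers does not involve copying or streaming data, because every worker points at the same storage.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStorage:
    """Hypothetical stand-in for the separate storage layer (in-memory only)."""
    rows: dict = field(default_factory=dict)


@dataclass
class ComputeService:
    """A stateless worker: it holds a reference to storage, never a copy of the data."""
    name: str
    storage: SharedStorage

    def write(self, key, value):
        self.storage.rows[key] = value

    def read(self, key):
        return self.storage.rows.get(key)


def scale_out(services, storage, target):
    """Add workers until there are `target` of them; no data is copied."""
    while len(services) < target:
        services.append(ComputeService(f"svc-{len(services)}", storage))
    return services


storage = SharedStorage()
services = scale_out([], storage, target=1)
services[0].write("user:1", "alice")

# Scale from 1 to 3 workers; the new ones can read existing data straight away.
services = scale_out(services, storage, target=3)
print(services[2].read("user:1"))  # prints: alice
```

In this sketch, scaling changes only the number of `ComputeService` objects; the contents of `storage` are untouched.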

If you're interested in finding out more, the Astra DB Serverless Whitepaper discusses this in great detail. Cheers!



Thanks Erick, for the document. I went through the document. Based on that I am summarizing my understanding below. Please let me know if I have understood this correctly.

Astra follows a microservices-based architecture, so every service (including the data service) runs in a container in a Kubernetes pod. All of these services are stateless. When load increases, Astra can spin up another pod running the service. Since the service is stateless, it does not need to know the state of the cluster immediately. The state of the cluster is managed by etcd, so new services contact etcd to get the cluster information automatically.

Based on the above understanding, I have the following follow-up questions:

1. Will the service be non-functional until it has loaded all of the cluster-related information (for example, replaying the local commit log or getting the details of the SSTables)?

2. How long does it take for the service to become functional once it is spun up?

I would say it is instantaneous. Is there something else that you're trying to achieve with the Database-as-a-Service (DBaaS) offering from your application side? Feel free to take a look at the features of Astra DB that you could leverage to build your cloud-native application.
There is no concept of "Cassandra nodes" in Astra's serverless architecture so you shouldn't think of the Astra services as the equivalent of nodes. There's also no concept of "replaying commitlogs" so this isn't relevant in Astra.

A Kubernetes pod (Astra service) is operational and ready to service requests as soon as it comes online. It's just how Kubernetes works and isn't specific to Astra. Cheers!
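As a rough way to picture the difference, the toy Python sketch below contrasts a component that must copy data before it can serve requests with one that reads from shared storage on demand. Both classes (`StatefulNode`, `StatelessService`) and the `ready` flag are hypothetical and exist only for illustration; this is not Astra or Kubernetes code.

```python
class StatefulNode:
    """Toy stand-in for a traditional node that owns a local copy of data."""

    def __init__(self):
        self.local_data = {}
        self.ready = False

    def bootstrap(self, peer_data):
        # Must copy its share of the data before it can serve requests.
        self.local_data.update(peer_data)
        self.ready = True


class StatelessService:
    """Toy stand-in for a service that reads from shared storage on demand."""

    def __init__(self, shared_storage):
        self.shared_storage = shared_storage
        self.ready = True  # nothing to load, so ready at construction

    def read(self, key):
        return self.shared_storage.get(key)


existing = {"k1": "v1", "k2": "v2"}

node = StatefulNode()
print(node.ready)  # False until bootstrap() has copied the data
node.bootstrap(existing)
print(node.ready)  # True

svc = StatelessService(existing)
print(svc.ready, svc.read("k1"))  # True v1
```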
