When a new node is added to Astra DB (which I understand is transparent to users), how does it know which data to load? Also, does it connect to seed nodes and maintain a cluster map?
Hi - the Astra DB architecture is a little different from the traditional C* architecture. Astra DB separates the compute and storage of the nodes so that the architecture is truly serverless. There is a whitepaper that helps to explain this.
I hope this helps.
Astra DB is a serverless, cloud-native Cassandra database-as-a-service (DBaaS), meaning that there are no servers (nodes) involved in the traditional sense that most users are accustomed to.
The compute functions that traditional Cassandra nodes provide (including query coordination, reading and writing data, compaction, repair, etc.) are decoupled from the data.
Astra DB runs on Kubernetes, and the compute functions run as services that operate on a separate storage layer. The Astra DB Serverless Whitepaper includes a diagram that illustrates this architecture.
In a serverless architecture, the compute instances scale independently from the storage layer. To answer your question directly, there are no Cassandra "nodes" in Astra DB so we don't add new nodes. Instead, we scale the Coordination Services and/or Data Services to match the throughput as required.
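As a rough illustration of scaling compute independently of storage, here is a toy Python sketch. It is not Astra DB's autoscaling logic: the `ServiceTier` class, the `replicas_needed` and `scale` helpers, and the per-replica capacity numbers are all hypothetical, chosen only to show that changing the replica count of each service leaves the storage layer untouched.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTier:
    """A hypothetical stateless compute tier (not an Astra DB API)."""
    name: str
    replicas: int
    ops_per_replica: int  # assumed capacity of one replica, in ops/sec


def replicas_needed(target_ops: int, ops_per_replica: int, minimum: int = 1) -> int:
    """Smallest replica count whose combined capacity covers target_ops."""
    return max(minimum, math.ceil(target_ops / ops_per_replica))


def scale(tier: ServiceTier, target_ops: int) -> ServiceTier:
    """Return a copy of the tier resized for the requested throughput."""
    return ServiceTier(
        tier.name,
        replicas_needed(target_ops, tier.ops_per_replica),
        tier.ops_per_replica,
    )


# Stand-in for the separate storage layer; scaling compute never touches it.
stored_objects = frozenset({"sstable-0001", "sstable-0002"})

coordinators = ServiceTier("coordination-service", replicas=2, ops_per_replica=5000)
data_services = ServiceTier("data-service", replicas=3, ops_per_replica=2000)

# Each tier is resized on its own for the same throughput target.
coordinators = scale(coordinators, target_ops=12000)
data_services = scale(data_services, target_ops=12000)

print(coordinators.replicas, data_services.replicas, len(stored_objects))
```

The two tiers end up with different replica counts for the same load because their assumed per-replica capacities differ, while the set of stored objects is the same before and after.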
If you're interested in finding out more, the Astra DB Serverless Whitepaper discusses this in great detail. Cheers!
Thanks Erick, for the document. I went through the document. Based on that I am summarizing my understanding below. Please let me know if I have understood this correctly.
Astra follows a microservices-based architecture, so every service (including the data service) runs in a container in a Kubernetes pod. All of these services are stateless, so when load increases, Astra can spin up another pod running the service. Because the service is stateless, it does not need to know the state of the cluster immediately. The state of the cluster is managed by etcd, so new services contact etcd to get the cluster information automatically.
Based on the above understanding, I have the following follow-up questions:
1. Will the service be non-functional until it has loaded all the cluster-related information (for example, replaying the local commit log or getting the details of the SSTables)?
2. How long does it take for the service to become functional once it is spun up?
A Kubernetes pod (Astra service) is operational and ready to service requests as soon as it comes online. It's just how Kubernetes works and isn't specific to Astra. Cheers!
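To make the stateless-service idea from this thread concrete, here is a minimal Python sketch. It is an assumption-laden toy, not Astra DB's implementation: `MetadataStore` is an in-memory stand-in for a shared store such as etcd (taken from the summary in the follow-up question), and `StatelessService`, the `tables/...` key layout, and the storage URIs are hypothetical. The point is only that a service holding no local state has nothing to replay or load before it can handle its first request.

```python
class MetadataStore:
    """In-memory stand-in for a shared metadata store such as etcd (assumption)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class StatelessService:
    """Hypothetical compute service that keeps no cluster state of its own."""

    def __init__(self, name, store):
        self.name = name
        self._store = store  # a reference to shared state, never a local copy

    def handle(self, table):
        # Every request looks up the current location in the shared store.
        location = self._store.get(f"tables/{table}")
        if location is None:
            raise KeyError(f"unknown table: {table}")
        return f"{self.name} reads {table} from {location}"


store = MetadataStore()
store.put("tables/users", "storage://bucket-a/users")  # hypothetical URI

first = StatelessService("data-service-1", store)

# A replica started later needs no warm-up: it can answer straight away.
second = StatelessService("data-service-2", store)
print(second.handle("users"))

# Updates to the shared store are visible to every replica on the next request.
store.put("tables/users", "storage://bucket-b/users")
print(first.handle("users"))
```

How long a real pod takes to become ready depends on the container and its readiness checks, which this sketch does not model.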