
vijayakumar.sithamohan1_178594 asked ·

How can I scale Cassandra, Search and Analytics independently in a DSE cluster?

Hi,

I am looking for documentation about the various deployment strategies, for example how I can scale the individual components (DSE Search, Spark, Cassandra) independently. What are the best practices?

Thanks




It depends heavily on the infrastructure you are going to use. Could you provide more details?


I am looking for an in-house, on-premises deployment.

Erick Ramirez answered ·

@vijayakumar.sithamohan1_178594 There is nothing special about those workloads. They are isolated into logical data centres (DCs), but apart from that the concept of a DC is the same as in pure Cassandra; the DCs simply handle different workloads.

You can increase the capacity of each DC by adding nodes to it independently of the other DCs. It all depends on how much capacity you require.

Let us know if there is something specific you're after and we'd be happy to answer. Cheers!

Cedrick Lunven answered ·

@vijayakumar.sithamohan1_178594 completing Erick's answer.

As you know, the Search, Graph and Analytics (Spark) workloads are part of the same DSE binary. As such, you install the same software on all nodes; nothing in the installation is tied to a particular workload.
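Since every node runs the same binary, the workload is chosen when the node is started. Here is a minimal sketch of that idea, assuming a tarball install where `dse cassandra` takes `-k` for Analytics (Spark), `-s` for Search and `-g` for Graph. The helper name `dse_start_command` and the workload labels are made up for illustration; check the start-up options against the documentation for your DSE version.

```python
# Hypothetical helper: maps workload labels (my own naming) to the
# `dse cassandra` start-up flags assumed for a tarball install.
WORKLOAD_FLAGS = {"analytics": "-k", "search": "-s", "graph": "-g"}


def dse_start_command(workloads):
    """Build the start command for a node with the given workloads enabled."""
    unknown = set(workloads) - set(WORKLOAD_FLAGS)
    if unknown:
        raise ValueError(f"unknown workloads: {sorted(unknown)}")
    # Emit flags in a stable order, whatever order the caller used.
    flags = [flag for name, flag in WORKLOAD_FLAGS.items() if name in workloads]
    return ["dse", "cassandra", *flags]


if __name__ == "__main__":
    # A node in a Search + Analytics DC; an empty list is a plain Cassandra node.
    print(" ".join(dse_start_command(["search", "analytics"])))
    print(" ".join(dse_start_command([])))
```

Every node of a given DC would be started with the same set of flags, so scaling a workload means adding nodes started that way to its DC.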

Depending on the workloads you enable on your nodes, you will want to check your hardware:

  • Spark benefits from more CPU
  • Every workload needs more RAM

You can find the recommended values here:

https://docs.datastax.com/en/dse-planning/doc/planning/capacityPlanning.html


Technically speaking, you want to enable the same workloads on all nodes of the same data centre (DC). You can have a dedicated DC for Analytics and OLAP queries. Search and Cassandra OLTP (CRUD) workloads are often in the same DC, with search queries simply run at consistency level LOCAL_ONE. A keyspace can also live in 2 DCs with a different workload on each DC. No problem.
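To illustrate a keyspace living in 2 DCs, here is a rough sketch that builds a `CREATE KEYSPACE` statement using `NetworkTopologyStrategy` with one replication factor per DC. The function name, the keyspace name `shop`, the DC names `dc_oltp` and `dc_analytics`, and the replication factors are all assumptions for the example, not values from this thread; use the DC names your cluster actually reports.

```python
def create_keyspace_cql(keyspace, replication_by_dc):
    """Return a CREATE KEYSPACE statement replicating to each listed DC.

    replication_by_dc maps a DC name to its replication factor,
    e.g. {"dc_oltp": 3, "dc_analytics": 2} (hypothetical names).
    """
    if not replication_by_dc:
        raise ValueError("at least one data centre is required")
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in replication_by_dc.items():
        if not isinstance(rf, int) or rf < 1:
            raise ValueError(f"invalid replication factor for {dc}: {rf!r}")
        parts.append(f"'{dc}': {rf}")
    options = ", ".join(parts)
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{{options}}};"
    )


if __name__ == "__main__":
    # One OLTP/Search DC and one Analytics DC sharing the same keyspace.
    print(create_keyspace_cql("shop", {"dc_oltp": 3, "dc_analytics": 2}))
```

Because each DC has its own replication factor and its own nodes, you can add nodes to the Analytics DC without touching the OLTP/Search DC, and the other way round.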


My 2c
