Bringing together the Apache Cassandra experts from the community and DataStax.


arunkolluri07_188529 asked Erick Ramirez answered

Can I add more nodes to redistribute data more evenly?

Hi Team,

We are running an 18-node cluster with uneven data distribution. The 3 nodes that hold the most data are showing latency issues.

My question is: can I add 6 more nodes so that the data is redistributed more evenly, or can I use allocate_tokens_for_keyspace to distribute the data evenly?

Also, this cluster has 250 tables. Should I run nodetool tablehistograms to check whether the nodes have large partitions, and then remodel the affected tables?

Any help will be much appreciated.

Datacenter: us-east-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load        Tokens  Owns   Host ID                           Rack
UN  10.77.15.6   188.17 GiB  16      30.8%  cb3872-2900-4b10-8547-190eaf2965a6  2c
UN  10.77.53.8   224.27 GiB  16      39.4%  45efb4-b3bd-4da2-a6bd-51bfc2e7c86d  2c
UN  10.77.91.18  199.22 GiB  16      34.6%  7bbe11-5f1c-4dbd-9273-a4212a26211d  2a
UN  10.77.76.5   150.12 GiB  16      24.2%  c0e232-201a-4302-8f47-a802810da912  2b
UN  10.77.81.6   183.8 GiB   16      29.8%  e35867-f0ba-4af1-8c40-3431d5ef63ab  2c
UN  10.77.65.4   156.63 GiB  16      29.4%  d36dae-87dc-4fa7-a33d-1c82afc27734  2a
UN  10.77.69.13  206.26 GiB  16      36.0%  db6f71-6a57-44ee-9723-f648ed313c42  2a
UN  10.77.31.38  241.86 GiB  16      43.4%  25d109-6380-46bb-b7b0-9a9c8feca4a5  2b
UN  10.77.65.23  201.38 GiB  16      32.5%  705df9-888c-4d8d-8162-cc542c5b20a8  2b
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns   Host ID                           Rack
UN  10.77.12.93  143.66 GiB  16     27.6%  409a3c-e95e-4ebb-9dc8-7957b4877f32  2c
UN  10.77.11.15  257.9 GiB   16     44.9%  1efb50-a04c-4064-89e6-e9a1c3e19641  2a
UN  10.77.13.17  230.64 GiB  16     39.1%  594604-62f5-4b85-b566-d9e2858f992b  2b
UN  10.77.19.24  173.96 GiB  16     30.4%  8d7f50-7103-4bdd-b5c4-dbb14c49614e  2b
UN  10.77.12.16  271.3 GiB   16     47.9%  32e1e1-d405-4599-9eab-01e17c8c9c7f  2c
UN  10.77.14.62  168.47 GiB  16     27.2%  97a1d3-7c40-478b-bb8a-c96dedfb8def  2a
UN  10.77.09.81  174.54 GiB  16     27.9%  135b08-c9fa-4b16-86ec-7f314af02676  2a
UN  10.77.7.10   135.37 GiB  16     24.5%  1371ae-5073-4b58-9408-14374afd037b  2c
UN  10.77.18.84  190.6 GiB   16     30.5%  1fd2d4-7ee5-434e-990f-fbc916052a3e  2b
virtual nodes

1 Answer

Erick Ramirez answered

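Reviewing the distribution of partition sizes is a good start. How much a wide range of partition sizes matters depends on how many partitions there are: if a table has, for example, 2 billion partitions, the size differences matter much less than in a table which only has 1,000 partitions, where a handful of large partitions can skew the load on whichever nodes own them.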
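
To make the point about partition counts concrete, here is a rough simulation sketch. It is a toy model, not how Cassandra actually places data: it assumes each partition lands on a uniformly random node (no token ranges, vnodes, or replication) and that partition sizes follow a log-normal distribution. The function names and all the numbers are made up for illustration.

```python
import random
import statistics


def simulate_node_loads(num_partitions, num_nodes, seed=0):
    """Toy model: drop each partition on a uniformly random node.

    Assumptions (not from the thread): sizes are log-normal, and
    placement ignores token ranges, vnodes, and replication.
    """
    rng = random.Random(seed)
    loads = [0.0] * num_nodes
    for _ in range(num_partitions):
        size = rng.lognormvariate(0.0, 1.5)  # wide range of partition sizes
        loads[rng.randrange(num_nodes)] += size
    return loads


def imbalance(loads):
    """Ratio of the busiest node's load to the average load (1.0 = perfectly even)."""
    return max(loads) / statistics.mean(loads)


if __name__ == "__main__":
    for count in (1_000, 200_000):
        ratio = imbalance(simulate_node_loads(count, num_nodes=18))
        print(f"{count:>7} partitions -> busiest node / average = {ratio:.2f}")
```

With the same spread of partition sizes, the ratio should sit much closer to 1.0 for the larger partition count, because the large and small partitions average out on each node.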

Adding nodes won't necessarily balance the data across nodes if there aren't a lot of partitions. Cheers!
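
One way to see why extra nodes may not help: a partition is never split across nodes, so the busiest node can never hold less than the largest single partition, however many nodes there are. The sketch below computes that lower bound for a made-up table (three 40 GiB partitions plus 1,000 small ones). The sizes and the function name are hypothetical, and replication is ignored.

```python
def busiest_node_lower_bound(partition_sizes_gib, num_nodes):
    """Lower bound on the busiest node's load, ignoring replication.

    The busiest node holds at least the average load, and at least
    the largest partition, because a partition cannot be split.
    """
    average = sum(partition_sizes_gib) / num_nodes
    return max(max(partition_sizes_gib), average)


# Hypothetical table: three very large partitions plus 1,000 small ones.
sizes = [40.0] * 3 + [0.5] * 1000

for nodes in (18, 24):
    average = sum(sizes) / nodes
    bound = busiest_node_lower_bound(sizes, nodes)
    print(f"{nodes} nodes: average {average:.1f} GiB, busiest node >= {bound:.1f} GiB")
```

In this example, going from 18 to 24 nodes lowers the average load per node, but the bound for the busiest node stays at 40 GiB, so the relative imbalance actually gets worse. That is the case where remodelling the table helps more than adding nodes.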
