patrickjp93 asked Erick Ramirez edited

Should You Change Partition Column Values to Rebalance Hot Partitions?

Problem: We have tried two sensible partitioning schemes for tracking the current state of a fleet of sensors: one keyed on ((sensor_id), ...) and the other on ((user_id, sensor_id), ...). Both leave us with a very hot partition, and we are not in a position to add more nodes quickly. Can we deliberately change the value of the partition column to send noisy sensors to another partition? Should we even attempt to? Has anyone else needed to do this?

What is the better/best practice here?

data modeling

1 Answer

Erick Ramirez answered Erick Ramirez edited

I'm not sure I understand the problem in your post. It would be fantastic if you could update the title to be an explicit question.

I'd also like to suggest adding an opening paragraph that states the problem in one or two sentences. That way, other contributors have a good idea of exactly what you're asking. Cheers!

[UPDATE] In general terms, "hot partitions" are partitions that are (a) heavily updated, (b) constantly read, or (c) both. That constant traffic is what makes them "hot".

If you have a keyspace with a replication factor of 3 in a DC, the three replica nodes for a given hot partition are constantly under load. If just one or two partitions are hot, it doesn't really matter how many nodes there are in the cluster: it will always be the three replica nodes for each hot partition that show signs of stress.
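To illustrate why cluster size does not help a single hot partition, here is a minimal sketch of replica placement on a token ring. It is a simplification, not how Cassandra is implemented: it assumes one token per node (no vnodes), uses MD5 as a stand-in for the Murmur3 partitioner, and places replicas on consecutive ring positions as a simple single-DC strategy would. The node names and the key `sensor-42` are made up.

```python
import bisect
import hashlib


def token(value: str) -> int:
    # Stand-in hash (assumption): Cassandra's default partitioner is Murmur3,
    # but any stable hash is enough to illustrate placement.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(node_count: int) -> list[tuple[int, str]]:
    # One token per node, sorted by token (simplification: no vnodes).
    return sorted((token(f"node-{i}"), f"node-{i}") for i in range(node_count))


def replicas(ring: list[tuple[int, str]], partition_key: str, rf: int = 3) -> list[str]:
    # The first replica is the first node whose token is >= the key's token,
    # wrapping around the ring; the remaining replicas are the next nodes clockwise.
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token(partition_key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(min(rf, len(ring)))]


for n in (6, 12, 48):
    owners = replicas(build_ring(n), "sensor-42")
    print(f"{n} nodes -> {len(owners)} replicas for the hot partition: {owners}")
```

Whichever nodes end up owning the key, the count stays at the replication factor, so every read and write for that partition lands on the same three nodes.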

However, if a larger proportion of the partitions in a table are hot, adding a sufficient number of nodes can alleviate the symptoms a bit, although it won't necessarily solve the underlying issue in the long term. Perhaps I misunderstood what you meant by "hot partitions"; if you provide additional information, I'd be happy to update my response.

Furthermore, it isn't possible to change the partition key once you have created a table (if that's what you meant by "change the value of the partition column"). You will need to create a new table with a different primary key.
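As an illustration only, one shape a new table's primary key could take is a composite partition key with an extra bucket column, so that writes for one sensor spread over several partitions. Everything in this sketch is hypothetical: the table name `sensor_state_bucketed`, its columns, the bucket count, and the idea of deriving the bucket from a per-write ID. The trade-off is on the read side, where finding the current state means querying every bucket and keeping the newest row.

```python
import zlib

BUCKETS = 8  # hypothetical bucket count; would need tuning against real load

# Hypothetical schema for the new table (CQL kept as a string for reference).
NEW_TABLE_CQL = """
CREATE TABLE sensor_state_bucketed (
    sensor_id  text,
    bucket     int,
    updated_at timestamp,
    state      text,
    PRIMARY KEY ((sensor_id, bucket), updated_at)
) WITH CLUSTERING ORDER BY (updated_at DESC);
"""


def bucket_for(write_id: str, buckets: int = BUCKETS) -> int:
    # Stable hash of a per-write identifier (assumption), reduced to a bucket number.
    return zlib.crc32(write_id.encode()) % buckets


def partition_keys_for_read(sensor_id: str, buckets: int = BUCKETS) -> list[tuple[str, int]]:
    # A reader has to visit every (sensor_id, bucket) partition for the sensor.
    return [(sensor_id, b) for b in range(buckets)]


# Writes for the same sensor land in different partitions depending on the write ID.
for write_id in ("w-1001", "w-1002", "w-1003"):
    print(("sensor-42", bucket_for(write_id)))

print(partition_keys_for_read("sensor-42"))
```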

What did you mean by "send noisy sensors to another partition"? I can't see a possibility of doing that since your tables are partitioned by the sensor ID. There's no room for movement here. Again, let me know if I misunderstood your question. Cheers!


patrickjp93 commented

Hopefully the new version is easier to parse.
