Does the order of fields a and b in a multi-column partition key matter for data distribution?
@praneethk29_179300 Yes, the order of the fields in a multi-column (composite) partition key does matter.
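A token is a hash of the partition key that the partitioner uses to decide which node stores a row. Each node is assigned tokens as well, and those fix its position in the ring and the range of data it is responsible for. Because the partition key columns are hashed in their declared order, (a, b) and (b, a) generally produce different tokens.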
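To make the ring idea concrete, here is a minimal Python sketch of how a token could be mapped to its primary owner. It is only an illustration under simplifying assumptions: the node names and token values are made up, there is one token per node (no vnodes), and replication is ignored. The function name `primary_owner` is hypothetical and not part of any Cassandra API.

```python
import bisect

# Hypothetical 3-node ring with made-up tokens. Real Murmur3 tokens span
# -2**63 .. 2**63 - 1, and real clusters usually have many vnodes per node.
RING = sorted([(-100, "node-a"), (0, "node-b"), (100, "node-c")])
TOKENS = [token for token, _ in RING]


def primary_owner(token: int) -> str:
    """Return the node whose range (previous_token, node_token] holds token."""
    # First node whose token is >= the row's token.
    idx = bisect.bisect_left(TOKENS, token)
    if idx == len(TOKENS):
        idx = 0  # past the highest token: wrap around to the first node
    return RING[idx][1]
```

With this toy ring, primary_owner(-99) falls in node-b's range, while primary_owner(101) wraps around to node-a. A different token can therefore mean a different owning node.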
For example, in a 3-node cluster, create two tables with the partition key columns in opposite order:
CREATE TABLE killer_video.email_by_title (email text, title text, year int, PRIMARY KEY ((email, title)));
CREATE TABLE killer_video.title_by_email (title text, email text, year int, PRIMARY KEY ((title, email)));
Insert some data:
INSERT INTO killer_video.email_by_title (email, title, year) VALUES ('email@example.com', 'avatar', 2009);
INSERT INTO killer_video.title_by_email (title, email, year) VALUES ('avatar', 'firstname.lastname@example.org', 2009);
Get the endpoints that own each partition key, as detailed in the nodetool getendpoints documentation:
-bash-4.2$ nodetool getendpoints killer_video title_by_email "avatar:joe@test"
10.142.0.4
-bash-4.2$ nodetool getendpoints killer_video email_by_title "email@example.com:avatar"
10.142.0.3
The two partition keys, which have their components in switched order, are owned by different nodes.
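If it helps to see why switching the order changes the result, the following Python sketch hashes a two-component key in both orders. Treat it as a rough model rather than Cassandra's actual code: it assumes each component is written as a 2-byte length, the value, and a trailing zero byte, and it uses MD5 from the standard library as a stand-in hash, whereas the default Murmur3Partitioner uses MurmurHash3. The helper names `serialize_composite` and `toy_token` are hypothetical.

```python
import hashlib
import struct


def serialize_composite(*components: str) -> bytes:
    """Concatenate components in order: 2-byte length, value, zero byte."""
    out = bytearray()
    for component in components:
        raw = component.encode("utf-8")
        out += struct.pack(">H", len(raw)) + raw + b"\x00"
    return bytes(out)


def toy_token(*components: str) -> int:
    """Stand-in token: first 8 bytes of an MD5 digest as a signed 64-bit int."""
    digest = hashlib.md5(serialize_composite(*components)).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


# Same two values, opposite order: the serialized bytes differ,
# so the hash input differs and so does the resulting token.
print(toy_token("avatar", "joe@test"))
print(toy_token("joe@test", "avatar"))
```

The two printed values differ because the hash sees different byte sequences. Note that a different token does not guarantee a different node: on a small cluster both tokens can still fall into ranges owned by the same node, so nodetool getendpoints remains the way to check actual placement.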