nag9sri_139755 asked ·

Will data be stored on the same node if 2 tables have the same partition key?

I have the following two tables (taken from Cassandra: The Definitive Guide, https://gist.github.com/jeffreyscarpenter/761ddcd1c125dfb194dc02d753d31733). What is guaranteed about these tables, given that they have the same partition key?

1. Can we safely assume that the data for both tables is stored on the same node, as long as the partition key value is the same?

2. Since the tables are different from each other, will their data be stored in different partitions or in the same partition on that "same" node?

CREATE TABLE hotel.pois_by_hotel (
    poi_name text,
    hotel_id text,
    description text,
    PRIMARY KEY ((hotel_id), poi_name)
) WITH comment = 'Q3. Find pois near a hotel';
CREATE TABLE hotel.available_rooms_by_hotel_date (
    hotel_id text,
    date date,
    room_number smallint,
    is_available boolean,
    PRIMARY KEY ((hotel_id), date, room_number)
) WITH comment = 'Q4. Find available rooms by hotel / date';



1 Answer

alex.ott answered ·

1. If both tables have the same partition key, the same key value is mapped to the same token. If the tables are in the same keyspace, then yes, they will be on the same node(s). If they are in different keyspaces, the replica sets may overlap only partially, for example when the replication factors differ and one keyspace has a higher RF.

2. Each table has its own set of files on disk, so although the tables have the same "logical partitions", on disk their data is in different files. You can always look at the data files, which live under something like /var/lib/cassandra/data/<keyspace>/<table>-<table-uuid>/


So does that mean the combination of keyspace and partition key is what determines which nodes a table's data is placed on (although there could be an overlap, depending on RF)?

alex.ott replied to nag9sri_139755 ·

The partition key value is used to calculate the token value. The token falls within a token range that is mapped to a specific host (the primary replica). If the keyspace has RF > 1, other hosts are also used to store replicas. So tables with the same partition key value are always guaranteed to have the same primary replica and to be stored on that same host, provided their keyspaces are replicated to the specific DC. If one keyspace has RF=2 and another has RF=5, the replicas for the first will be on nodes 1,2 (just an example) and for the other on nodes 1,2,3,4,5, so there is some overlap, but not a complete one.
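
To make the token-to-replica mapping concrete, here is a rough sketch in Python. It is only an illustration under several assumptions: the five-node ring and its token values are made up, the hash is an MD5-based stand-in rather than Cassandra's Murmur3 partitioner, and placement simply walks the ring clockwise (roughly what SimpleStrategy does, ignoring racks, DCs and vnodes). The key value "AZ123" is also just an example.

```python
import hashlib
from bisect import bisect_left

# Hypothetical 5-node ring as (token, node) pairs sorted by token.
# A real cluster uses Murmur3 tokens and usually many vnodes per node.
RING = [
    (-6_000_000_000_000_000_000, "node1"),
    (-2_000_000_000_000_000_000, "node2"),
    (1_000_000_000_000_000_000, "node3"),
    (4_000_000_000_000_000_000, "node4"),
    (8_000_000_000_000_000_000, "node5"),
]


def token(partition_key: str) -> int:
    """Stand-in hash (NOT Murmur3): map a key to a signed 64-bit token."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def replicas(partition_key: str, rf: int) -> list:
    """Return the primary replica, then the next rf-1 nodes clockwise."""
    tokens = [t for t, _ in RING]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect_left(tokens, token(partition_key)) % len(RING)
    count = min(rf, len(RING))
    return [RING[(start + i) % len(RING)][1] for i in range(count)]


# The table name never enters the calculation, only the key value and the
# keyspace's RF, so two tables in the same keyspace get identical replicas.
rf2 = replicas("AZ123", rf=2)  # e.g. a keyspace with RF=2
rf5 = replicas("AZ123", rf=5)  # e.g. another keyspace with RF=5
print(rf2, rf5)
print(rf2[0] == rf5[0])        # same primary replica in both keyspaces
```

Note that the table name is not an input to `replicas` at all, which is why tables sharing a keyspace and a partition key value end up on the same nodes, while a keyspace with a higher RF just adds more nodes on top of the same primary.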


@nag9sri_139755, you might also want to read the below resources for better understanding,
