Bringing together the Apache Cassandra experts from the community and DataStax.

Want to learn? Have a question? Want to share your expertise? You are in the right place!

Not sure where to begin? Getting Started

 

question

yekta1 avatar image
yekta1 asked Erick Ramirez answered

Composite partition key is not working

I have a playlist table like this:

CREATE TABLE demo.playlist_items (
        username  text,
        playlist_id  text,
        date_added  timestamp,
        PRIMARY KEY ((username, playlist_id) , date_added)
 )
 WITH CLUSTERING ORDER BY (date_added DESC);

as you can see, I use "username, playlist_id" as the partition key, and sort by date_added.

my problem is when I insert the same "username, palylist_id" twice, it's creating a new row every time.

the only way it works is if I remove the date_added, but that way I can't sort.

data modeling
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

1 Answer

Erick Ramirez avatar image
Erick Ramirez answered

It works as expected. :)

When you have a clustering column in your primary key, what you're doing is "clustering" (grouping) the data into rows inside the partition. So with a PRIMARY KEY like:

    PRIMARY KEY ((username, playlist_id), date_added)

you'll end up with multiple rows of date added for each partition:

 username | playlist_id | date_added
----------+-------------+---------------------------------
    alice |      abc123 | 2022-02-02 20:11:36.795000+0000
    alice |      abc123 | 2010-06-05 20:10:16.616000+0000
    alice |      abc123 | 2004-10-07 16:49:55.696000+0000
      bob |      xyz789 | 2014-01-11 09:26:14.981000+0000
      bob |      xyz789 | 2010-04-16 18:53:01.798000+0000

In the example above, Alice has 3 rows and Bob has 2 rows.

If you're only interested in when the last playlist was added, don't add a clustering column to the PRIMARY KEY:

CREATE TABLE community.playlist (
    username text,
    playlist_id text,
    date_added timestamp,
    PRIMARY KEY ((username, playlist_id))
)

When you write to the table, there will only ever be one row in each partition:

 username | playlist_id | date_added
----------+-------------+---------------------------------
    alice |      abc123 | 2022-02-02 20:11:36.795000+0000
      bob |      xyz789 | 2014-01-11 09:26:14.981000+0000

The difference is that your table has a compound primary key:

    PRIMARY KEY ((username, playlist_id), date_added)

whereas the playlist table in my example only uses has a composite partition key:

    PRIMARY KEY ((username, playlist_id))

I've explained the differences in more detail in this post. I should also point out that datastax.com/dev has free tutorials that would be good for you. There's a short tutorial on Data Modeling which has diagrams and a nice explanation of the concepts. Cheers!

Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.