amitmund_177224 avatar image
amitmund_177224 asked Erick Ramirez edited

Primary key vs Clustering column vs Partition key vs Partition token

Primary key Vs Clustering column vs Partition key vs Partition token, Need some more help.

The relation between them.

data modeling
1 comment
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

Erick Ramirez avatar image Erick Ramirez ♦♦ commented ·

Just letting you know that we've had a high volume of questions in the last 48 hours and we've been caught up with the FREE Cassandra Workshop Series so our responses are delayed but we will get to them in the next few hours. Cheers!

0 Likes 0 ·
Erick Ramirez avatar image
Erick Ramirez answered


A table's primary key is one or more columns that uniquely identify:

  1. the location of the data in a cluster of nodes, and
  2. the order of the stored data.

The primary key's first element is the partition key. It is used to determine which node holds a given table's row(s) by hashing its value into a partition token (done by the default Murmur3Partitioner which uses the MurmurHash algorithm).

A simple primary key uses just one column as the partition key. When there are 2 or more columns enclosed in parenthesis at the start of a primary key, it is known as a composite partition key.

For tables with a compound primary key, the primary key has both a partition key and one or more clustering columns.



In this table, there is only one column in the partition key (username).

    username text,
    realname text,
    email text,
    PRIMARY KEY (username)


In this table, there are 2 columns that form the partition key enclosed in parenthesis ((title, year)):

    title text,
    year int,
    description text,
    PRIMARY KEY ((title, year))

The video title on its own is not unique. For example, the 1950 release of Superman with Kirk Alyn in the leading role is not the same Superman movie released in 1978 starring Christopher Reeve so we need to append the year next to the title to make the partition key unique -- "Superman:1950" and "Superman:1978".


This table has a single-column partition key (userid) and a clustering column (email):

CREATE TABLE user_emails (
    username text,
    email_type text,
    email_address text
    PRIMARY KEY (userid, email_type)

A user can have multiple emails -- personal, work, etc.

This table has a composite partition key ((title, year)) and 2 clustering columns (commented_at and comment):

CREATE TABLE comments_by_video_title (
    title text,
    year int,
    commented_at timestamp,
    comment text,
    username text,
    PRIMARY KEY ((title, year), commented_at, comment)

Comments are sorted with most recent as the first row. In this case, we can retrieve the 10 most recent comments about a video with the following query:

SELECT comment FROM comments_by_video_title \
    WHERE title = 'Superman'
    AND year = 1978;


10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

laxmikant.hcl_32751 avatar image
laxmikant.hcl_32751 answered

Primary key: uniquely identify a row. A primary key will definitely have partition key and can optionally have clustering column

Partition key: Partition key defines where your data will locate on a node

Clustering column: defines how the data is sorted within a partition

Partition token: It is the output number of a hash function which takes partition key value as an input. Each node has a token range and hence based on partition key (or token) data will go to the node which owns it.

Example: PRIMARY KEY( city, date, id ) ... in this primary key, city is the partition key, and date and id are clustering keys. within a city, data will be sorted based on date and then id.

Let's assume a Cassandra node A which owns range (0-50), and token of London is hash(London) = 10 all the data which has the city as London will go to node A.

Hope it helps.

10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.