Build Cloud-Native apps with Apache Cassandra

veeraragavan.g_193502 asked ·

Is the primary key part of the partition key?

Is the primary key part of the partition key when designing a table at creation time?

data modeling
study.aakansha2019_193516 answered ·

A primary key uniquely identifies a row.

A composite key is a key formed from multiple columns.

A partition key is the primary lookup to find a set of rows, i.e. a partition.

A clustering key is the part of the primary key that isn't the partition key; it defines the ordering of rows within a partition.


The PRIMARY KEY definition is made up of two parts: The Partition Key and the Clustering Columns. The first part maps to the storage engine row key, while the second is used to group columns in a row.


Nice answer! Thanks for being part of the Community. Cheers!

andrew.hogg answered ·

The primary key consists of two parts - the partition key and the clustering key.


The partition key determines data locality, i.e. where the data is stored, and is also needed to query the data. When specified, the clustering key is used to order the data within a single partition. The primary key must still be 'primary', i.e. it must uniquely identify a record.


Patrick has a great write-up: https://www.datastax.com/blog/2016/02/most-important-thing-know-cassandra-data-modeling-primary-key


Nice answer! Thanks for being part of the Community. Cheers!

Erick Ramirez answered ·

Looks like you have a lot of fans with all the answers here. :)

The primary key must include a partition key and can optionally include one or more clustering columns. And yes, the primary key must be defined at the time that you create a table because it cannot be changed once the table is created.

I've answered a similar question recently (see #6171) so let me add to the list of awesome responses by reposting it here.

Definitions

A table's primary key is one or more columns that uniquely identify:

  1. the location of the data in a cluster of nodes, and
  2. the order of the stored data.

The primary key's first element is the partition key. It is used to determine which node holds a given table's row(s) by hashing its value into a partition token (done by the default Murmur3Partitioner which uses the MurmurHash algorithm).

A simple primary key uses just one column as the partition key. When there are 2 or more columns enclosed in parentheses at the start of a primary key, it is known as a composite partition key.

For tables with a compound primary key, the primary key has both a partition key and one or more clustering columns.

Examples

SIMPLE PRIMARY KEY

In this table, there is only one column in the partition key (username).

CREATE TABLE users (
    username text,
    realname text,
    email text,
    PRIMARY KEY (username)
);
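
To see the hashing described in the definitions in practice, CQL's built-in token() function returns the partition token computed from a partition key. This is only a sketch against the users table above; the actual token values depend on the partitioner and the data, so none are shown here.

```cql
-- Show each username next to the token its partition key hashes to.
-- Rows with the same token belong to the same partition.
SELECT username, token(username) FROM users;
```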

COMPOSITE PARTITION KEY

In this table, there are 2 columns that form the partition key, enclosed in parentheses ((title, year)):

CREATE TABLE videos (
    title text,
    year int,
    description text,
    ...
    PRIMARY KEY ((title, year))
);

The video title on its own is not unique. For example, the 1950 release of Superman with Kirk Alyn in the leading role is not the same Superman movie released in 1978 starring Christopher Reeve so we need to append the year next to the title to make the partition key unique -- "Superman:1950" and "Superman:1978".

COMPOUND PRIMARY KEY

This table has a single-column partition key (username) and a clustering column (email_type):

CREATE TABLE user_emails (
    username text,
    email_type text,
    email_address text,
    ...
    PRIMARY KEY (username, email_type)
);

A user can have multiple emails -- personal, work, etc.

This table has a composite partition key ((title, year)) and 2 clustering columns (commented_at and comment):

CREATE TABLE comments_by_video_title (
    title text,
    year int,
    commented_at timestamp,
    comment text,
    username text,
    PRIMARY KEY ((title, year), commented_at, comment)
) WITH CLUSTERING ORDER BY (commented_at DESC, comment ASC);

Comments are sorted with the most recent as the first row. In this case, we can retrieve the 10 most recent comments about a video with the following query:

SELECT comment FROM comments_by_video_title
    WHERE title = 'Superman'
    AND year = 1978
    LIMIT 10;
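
Because commented_at is the first clustering column, it can also take a range restriction once the full partition key is supplied. The following is a minimal sketch; the cutoff date is a made-up value for illustration only.

```cql
-- Full partition key (title, year) plus a range on the first clustering column.
-- '2020-01-01' is a hypothetical cutoff, not a value from the original answer.
SELECT commented_at, comment FROM comments_by_video_title
    WHERE title = 'Superman'
    AND year = 1978
    AND commented_at >= '2020-01-01'
    LIMIT 10;
```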

Cheers!

bettina.swynnerton answered ·

Hi @veeraragavan.g_193502,

Every Cassandra table definition needs a primary key.

Every primary key needs a partition key. Clustering keys are an optional addition to the primary key, but a partition key is essential. The first element listed in the primary key is the partition key. The partition key is responsible for determining the data locality in your cluster, a key concept for Cassandra as a distributed database.

In its simplest form, with basic primary keys, the partition key is the primary key.

For example here:

CREATE TABLE my_keyspace.table1 (
    id text,
    data text,
    PRIMARY KEY (id)
);

If you add a clustering key, the partition key (id) is only the first part of the primary key (id, ckey):

CREATE TABLE my_keyspace.table2 (
    id text,
    ckey text,
    data text,
    PRIMARY KEY (id, ckey)
);
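
As a rough illustration of how the two parts of the primary key behave in table2, the statements below write several rows into one partition and then read them back. The values ('a', 'x', 'y', and the data strings) are invented for this sketch.

```cql
-- Two rows in the same partition (id = 'a'), distinguished by the clustering key.
INSERT INTO my_keyspace.table2 (id, ckey, data) VALUES ('a', 'x', 'first');
INSERT INTO my_keyspace.table2 (id, ckey, data) VALUES ('a', 'y', 'second');

-- Same full primary key (id, ckey) as the first insert: this overwrites that row.
INSERT INTO my_keyspace.table2 (id, ckey, data) VALUES ('a', 'x', 'replaced');

-- Partition key only: returns every row in the partition, ordered by ckey.
SELECT * FROM my_keyspace.table2 WHERE id = 'a';

-- Full primary key: returns at most one row.
SELECT * FROM my_keyspace.table2 WHERE id = 'a' AND ckey = 'y';
```

The partition key alone locates the partition, while the full primary key is what makes a row unique.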

This is still a good blog on the topic of primary keys:

https://www.datastax.com/blog/2016/02/most-important-thing-know-cassandra-data-modeling-primary-key

I hope this helps to answer your question.
