
Asked by Valuser, edited by Erick Ramirez

Is the number of partitions the same as number of rows in a table?

[FOLLOW UP QUESTIONS TO #9916]

In your explanation, when you added an email for the 'Alice' partition key, there will be 2 rows in the table for the same partition key, right? For example, after that addition you have a total number of keys (estimate) of 2.

So is the 'number of partitions (estimate)' eventually equal to the total number of rows (estimate) in a table?

I had a test table with 20k records, and its number of keys (estimate) was 3. I believe this 3 comes from the partition key.

I think I am missing something. Please clarify.

Could you explain the difference between the total number of partitions and the total number of rows in a table, based on the example context?


1 Answer

Answered and edited by Erick Ramirez

Partitions are not the same as rows: a partition can contain one or more rows.

In relational databases, tables are two-dimensional where each record is a row in the table. In Cassandra, tables are not just two-dimensional but can be multi-dimensional.

For tables where the primary key contains only the partition key, each partition can contain only one row, so the number of partitions is the same as the number of rows in the table. But for tables which have a compound primary key (partition key + clustering columns), each partition can have one or more rows.

I've explained this in a bit more detail with accompanying examples in Primary key vs Clustering column vs Partition key. I would recommend having a look at that post because it will clarify the concepts for you.

The example table I used in question #9916 has a simple primary key so each partition only contains one row:

CREATE TABLE community.users (
    name text PRIMARY KEY,
    address text,
    email text,
    mobile text
)

In contrast, here is an example table which has a compound primary key:

CREATE TABLE user_emails (
    username text,
    email_type text,
    email_address text,
    ...
    PRIMARY KEY (username, email_type)
)

Each user (partition) in this table can have multiple emails (rows).
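
To make the partition/row distinction concrete, here is a toy Python model of the idea. It is only a sketch and not how Cassandra stores data internally: a table is treated as a mapping from partition key to a mapping of clustering value to row. The helper names (`build_table`, `count_partitions`, `count_rows`) and the sample data are hypothetical, chosen purely for illustration.

```python
from collections import defaultdict


def build_table(records, partition_key, clustering_key):
    """Group records as a compound primary key would:
    partition key -> clustering value -> row."""
    table = defaultdict(dict)
    for record in records:
        pk = record[partition_key]
        ck = record[clustering_key]
        # Same partition key + same clustering value is the same row,
        # so the columns are merged rather than duplicated.
        table[pk].setdefault(ck, {}).update(record)
    return table


def count_partitions(table):
    """Number of distinct partition key values."""
    return len(table)


def count_rows(table):
    """Total number of rows across all partitions."""
    return sum(len(rows) for rows in table.values())


# Hypothetical sample data for the user_emails table.
records = [
    {"username": "alice", "email_type": "home", "email_address": "alice@example.com"},
    {"username": "alice", "email_type": "work", "email_address": "alice@example.org"},
    {"username": "bob", "email_type": "home", "email_address": "bob@example.com"},
]

user_emails = build_table(records, "username", "email_type")
print(count_partitions(user_emails))  # 2
print(count_rows(user_emails))        # 3
```

The same reasoning would be consistent with a test table holding 20k records while reporting only 3 keys: few distinct partition key values, with many rows stored in each partition.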

The output of nodetool tablestats only provides an estimate of partitions. It does not provide an estimate of rows.

To respond to your question directly:

"In your explanation, when you added email for 'Alice' partition key, for the same partition key there will be 2 rows in table right?"

No, that's incorrect. Because the users table has a simple primary key, there is only one row where the partition key is Alice, and the email was added to that same row. Cheers!
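
As a rough illustration of why the row count stays at one, here is a minimal Python sketch of upsert behaviour, where writing to an existing primary key updates that row instead of creating a second one. This is a toy model rather than Cassandra's implementation; the `upsert` helper and the column values are hypothetical.

```python
def upsert(table, primary_key, **columns):
    """Write columns for a primary key.
    If the row already exists, its columns are updated in place."""
    row = table.setdefault(primary_key, {})
    row.update(columns)
    return table


# Toy stand-in for a table whose primary key is just the partition key (name).
users = {}

# First write creates the row for Alice (values are made up for illustration).
upsert(users, "Alice", address="1 Example St", mobile="0400 000 000")

# Second write with the same key adds the email to the existing row.
upsert(users, "Alice", email="alice@example.com")

print(len(users))               # 1
print(sorted(users["Alice"]))   # ['address', 'email', 'mobile']
```

In this model, as in a table with a simple primary key, one partition key value maps to exactly one row no matter how many times it is written.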
