omar avatar image
omar asked Erick Ramirez answered

Unable to delete multiple rows, getting "Some partition key parts are missing: identifier"

I'm new to Cassandra and I've been having some issues trying to delete multiple rows in table. I have a table defined as follows:

CREATE TABLE aze.isis_users (
    identifier text PRIMARY KEY,
    iteration_count int,
    password_expires_on date,
    passwordword blob,
    roleidentifier text,
    salt blob

I would like to be able to delete all rows with roleidentifier = lim_user

1643792244354.pngI get the following 2 errors with 2 different query :



1643792244354.png (50.2 KiB)
1643792309611.png (28.5 KiB)
1643792331259.png (33.7 KiB)
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

starlord avatar image
starlord answered

Hi Omar,

Similar to when you query your data in Cassandra, you'll need to specify your primary key to perform a delete, in this case 'identifier'. To make your example work it would be:

delete from 'isis_users' where identifier = '<some_text>' and roleidentifier = 'lim_user';
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

Erick Ramirez avatar image
Erick Ramirez answered

In Cassandra, database records known as "partitions" are distributed randomly across nodes in the cluster. The partition key (row ID in traditional RDBMS) is used by Cassandra to determine which node the partition (record) is stored. In order to delete a partition, you need to specify the partition key.

Understanding partitions and how data is distributed is critical to learning why Cassandra is different to traditional RDBMS. Patrick McFadin explains the concept of partitions in this short video in great detail with examples and diagrams which I highly recommend. It is an extract of the free DS201 Foundations of Apache Cassandra course on DataStax Academy.

For what it's worth, I've previously explained the differences between partition keys and primary keys in question #6171 if you're interested. I've included examples to make it easier to understand the concepts.

In your table, the partition key (which also happens to be the primary key) is the column identifier. Your delete statement:

DELETE FROM isis_users WHERE roleidentifier = "lim_user"

does not include identifier in the WHERE clause so Cassandra does not know which partition(s) to delete. This is the reason you are getting the error:

Some partition key parts are missing: identifier

It is not possible to filter with roleidentifier since this column is not in the primary key. DELETE statements can only contain columns which are in the primary key.

So why doesn't it work the same as RDBMS? The short answer is that Cassandra solves difficult problems of scalability and availability. Imagine you had 2000 nodes in your cluster with billions or trillions of records in a table all scattered across the nodes. If Cassandra didn't require you to specify the partition key, it will need to perform a full table scan on all 2000 nodes in order to find the records you're after. This works in traditional relational databases because (1) they can't hold as many records, and (2) aren't distributed across physical locations in the same way as Cassandra.

Since you're new to Cassandra, have a look at If you scroll down the page, you'll see there are free hands-on tutorials which only take a few minutes to complete. The Cassandra Fundamentals course is a good starting point since you can quickly learn key concepts. These tutorials are interactive and run in your browser so you don't have anything to install or configure. Cheers!

10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.