Bringing together the Apache Cassandra experts from the community and DataStax.

Want to learn? Have a question? Want to share your expertise? You are in the right place!

Not sure where to begin? Getting Started

 

question

guzelcihad avatar image
guzelcihad asked Erick Ramirez commented

Why does an INSERT update a row where the clustering column has a null value?

Hi,

I got stucked with an example. Let me share it.

Assume that I've a table like this:

CREATE TABLE countries (   
country text,
total_pop int STATIC,
state text,
state_pop int,
PRIMARY KEY (country, state) );

Then I add a row like this then select it:

INSERT INTO countries (country, total_pop) VALUES ('USA', 111);
 country | state | total_pop | state_pop
---------+-------+-----------+-----------
     USA |  null |       111 |      null

First time, I didn't understand this because I dont think clustering key column can be null. But then thought that, that may be OK.

So, I add another row like this then select it:

INSERT INTO countries (country, state ) VALUES ('USA', 'Oregon');
 country | state  | total_pop | state_pop
---------+--------+-----------+-----------
     USA | Oregon |       111 |      null

This is the where I got stucked with it.

country and state provides uniqueness.

But why second query updates the first record in the table? Those are different records.

I was waiting for a result like this:

 country | state  | total_pop | state_pop
---------+--------+-----------+-----------
     USA |  null |       111 |      null
     USA | Oregon |       111 |      null

Can you explain it to me?

Thanks

clustering key
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

1 Answer

Erick Ramirez avatar image
Erick Ramirez answered Erick Ramirez commented

The "second query" does not update the "first record" in the table. The mistake here is that you came to the conclusion that a record exists in the table after your initial INSERT.

This statement does not create a row:

cqlsh> INSERT INTO countries (country, total_pop) VALUES ('USA', 111);

because you cannot have a NULL value for any columns in the PRIMARY KEY (see Defining a partition key with clustering columns):

A NULL value cannot be inserted into a PRIMARY KEY column. This restriction applies to both partition keys and clustering columns.

You cannot query rows in a partition when the clustering column is not set. All it does is create a partition where total_pop = 111 (a static column) without any rows in the partition. Up until this point, you have a partition without any rows in it.

When you run the second statement, it creates a new row for Oregon with the same total population of 111 since it is a STATIC column. It does not update an older record in the table because no rows existed until you inserted Oregon.

If you do successive inserts:

cqlsh> INSERT INTO countries (country, state ) VALUES ('USA', 'Texas');
cqlsh> INSERT INTO countries (country, state ) VALUES ('USA', 'Ohio');

it will create additional rows in the table just like when a new row was created for Oregon:

cqlsh> SELECT * FROM countries ;

 country | state  | total_pop | state_pop
---------+--------+-----------+-----------
     USA |   Ohio |       111 |      null
     USA | Oregon |       111 |      null
     USA |  Texas |       111 |      null

Cheers!

2 comments Share
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

Thanks for explanation.

The trick was here is the existence of the static column it seems.

0 Likes 0 ·

Yes, I agree. Cheers!

1 Like 1 ·