Bringing together the Apache Cassandra experts from the community and DataStax.

Want to learn? Have a question? Want to share your expertise? You are in the right place!

Not sure where to begin? Getting Started

 

question

guzelcihad avatar image
guzelcihad asked ·

Why does an INSERT update a row where the clustering column has a null value?

Hi,

I got stucked with an example. Let me share it.

Assume that I've a table like this:

CREATE TABLE countries (   
country text,
total_pop int STATIC,
state text,
state_pop int,
PRIMARY KEY (country, state) );

Then I add a row like this then select it:

INSERT INTO countries (country, total_pop) VALUES ('USA', 111);
 country | state | total_pop | state_pop
---------+-------+-----------+-----------
     USA |  null |       111 |      null

First time, I didn't understand this because I dont think clustering key column can be null. But then thought that, that may be OK.

So, I add another row like this then select it:

INSERT INTO countries (country, state ) VALUES ('USA', 'Oregon');
 country | state  | total_pop | state_pop
---------+--------+-----------+-----------
     USA | Oregon |       111 |      null

This is the where I got stucked with it.

country and state provides uniqueness.

But why second query updates the first record in the table? Those are different records.

I was waiting for a result like this:

 country | state  | total_pop | state_pop
---------+--------+-----------+-----------
     USA |  null |       111 |      null
     USA | Oregon |       111 |      null

Can you explain it to me?

Thanks

clustering key
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

1 Answer

Erick Ramirez avatar image
Erick Ramirez answered ·

The "second query" does not update the "first record" in the table. The mistake here is that you came to the conclusion that a record exists in the table after your initial INSERT.

This statement does not create a row:

cqlsh> INSERT INTO countries (country, total_pop) VALUES ('USA', 111);

because you cannot have a NULL value for any columns in the PRIMARY KEY (see Defining a partition key with clustering columns):

A NULL value cannot be inserted into a PRIMARY KEY column. This restriction applies to both partition keys and clustering columns.

You cannot query rows in a partition when the clustering column is not set. All it does is create a partition where total_pop = 111 (a static column) without any rows in the partition. Up until this point, you have a partition without any rows in it.

When you run the second statement, it creates a new row for Oregon with the same total population of 111 since it is a STATIC column. It does not update an older record in the table because no rows existed until you inserted Oregon.

If you do successive inserts:

cqlsh> INSERT INTO countries (country, state ) VALUES ('USA', 'Texas');
cqlsh> INSERT INTO countries (country, state ) VALUES ('USA', 'Ohio');

it will create additional rows in the table just like when a new row was created for Oregon:

cqlsh> SELECT * FROM countries ;

 country | state  | total_pop | state_pop
---------+--------+-----------+-----------
     USA |   Ohio |       111 |      null
     USA | Oregon |       111 |      null
     USA |  Texas |       111 |      null

Cheers!

2 comments Share
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

Thanks for explanation.

The trick was here is the existence of the static column it seems.

0 Likes 0 ·

Yes, I agree. Cheers!

1 Like 1 ·