question

rht.rajput_193125 avatar image
rht.rajput_193125 asked eddy.wong_186388 answered

Why am I getting warning messages on COUNT() of videos_by_actor in DS220 denormalization exercise?

After importing csv file into killrvideo.videos_by_actor table.. i tried counting import count by select count(*) .. It gave me result but with some messages. Can someone please explain why did i see these messages. Also, saw some upsert.

Following is my table structure. Can someone please help on what mistakes i made on choosing primary key.

CREATE TABLE killrvideo.videos_by_actor (
    actor text,
    added_date timestamp,
    video_id timeuuid,
    character_name text,
    description text,
    encoding frozen<video_encoding>,
    tags set<text>,
    title text,
    user_id uuid,
    PRIMARY KEY (actor, added_date, video_id)
) WITH CLUSTERING ORDER BY (added_date DESC, video_id ASC)
cqlsh:killrvideo> COPY videos_by_actor (actor, added_date, video_id, character_name, description, encoding, tags, title, user_id) FROM 'videos_by_actor.csv' WITH HEADER = true;
Using 1 child processes
Starting copy of killrvideo.videos_by_actor with columns [actor, added_date, video_id, character_name, description, encoding, tags, title, user_id].
Processed: 81659 rows; Rate: 701 rows/s; Avg. rate: 1475 rows/s
81659 rows imported from 1 files in 55.377 seconds (0 skipped).
cqlsh:killrvideo> select count(*) from videos_by_actor ;

count
-------
81574

(1 rows)
Warnings :
Aggregation query used without partition key
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Douglas Bristow) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Peter Jason) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Ingeborga Dapkūnaitė) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Christopher Stone) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Françoise Yip) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Emilio Estevez) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Alberto Viruena) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Peter Crook) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Aasif Mandvi) LIMIT 7489 (see tombstone_warn_threshold)
Read 7489 live rows and 7489 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Richard Jordan) LIMIT 7489 (see tombstone_warn_threshold)
Read 6584 live rows and 6584 tombstone cells for query SELECT * FROM killrvideo.videos_by_actor WHERE token(actor) >= token(Maura Tierney) LIMIT 7489 (see tombstone_warn_threshold)


academyds220count
3 comments
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

bettina.swynnerton avatar image bettina.swynnerton ♦♦ commented ·

Hi @rht.rajput_193125,

I've not been able to reproduce this, even with frequent runs of the copy command. Are you getting this on Astra? Or on the VM? Did you load the data several times?

Thanks!

0 Likes 0 ·
sud.shekhar_193255 avatar image sud.shekhar_193255 bettina.swynnerton ♦♦ commented ·

Hi @bettina.swynnerton, I am also getting this error while doing exercise on VM. Is it because of UDT data types in table?

0 Likes 0 ·
bettina.swynnerton avatar image bettina.swynnerton ♦♦ sud.shekhar_193255 commented ·

As I have not been able to reproduce this, would you let me have more context: Is this on Astra? Did you load the data several times?

Thanks!

0 Likes 0 ·
Erick Ramirez avatar image
Erick Ramirez answered

There isn't enough information in your question to diagnose the problem. We've made several attempts replicate the issue you reported but in all instances, both the (a) data load and (b) COUNT() complete successfully without any issues.

For there to be tombstones scanned, at some point you must have deleted some or all the data. In order to diagnose this issue, please provide the following:

  • information about the cluster you're connecting to (e.g. Astra, DS220 VM, self-built cluster)
  • a copy of the cqlsh history file (located in the home directory of the OS user in ~/.cassandra/cqlsh_history)

You can otherwise workaround the problem by running a TRUNCATE on the videos_by_actor table.

If we don't hear back from you, we are going to assume you manage to workaround the problem. Cheers!

Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

eddy.wong_186388 avatar image
eddy.wong_186388 answered

I've seen a similar error before on the videos table. This makes it go away momentarily:


alter table killrvideo.videos_by_actor with gc_grace_seconds=0;


This forces a compaction, which gets rid of the tombstones. I'm wondering if the data itself causes the tombstones.

Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.