question

s.decristofaro_187478 avatar image
s.decristofaro_187478 asked vikram.singh.chouhan_187371 answered

Are queries like SELECT COUNT(*) and SELECT * with no filter inefficient?

Hi All,

i have a question about these two queries:

SELECT * FROM table;

and

SELECT COUNT(*) FROM table;

Following the Cassandra data modeling they should scan the whole cassandra space, but i noticed that no error is thrown and no ALLOW FILTERING is required.

So my question is: are they inefficient? or is there an exception for these two queries?

Thanks.

cassandracount
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

Erick Ramirez avatar image
Erick Ramirez answered vikram.singh.chouhan_187371 commented

@s.decristofaro_187478 Yes, you are correct. Both of those queries are expensive.

Unbounded SELECT statements (reads without a WHERE clause filter) are inefficient because it is a full table scan and will attempt to retrieve all records from the database. The same goes for SELECT COUNT().

It may appear to work when you tested it but that's likely because you were querying a very small table with less than a few thousand records. If you were doing this in a production environment where you have tens or hundreds of thousands of records, the query will timeout before it completes.

If you are interested, I've explained it in detail in this blog post. Cheers!

2 comments Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

s.decristofaro_187478 avatar image s.decristofaro_187478 commented ·

Thank you very much for your answer Erick

0 Likes 0 ·
Erick Ramirez avatar image Erick Ramirez ♦♦ s.decristofaro_187478 commented ·

Not a problem. Cheers!

0 Likes 0 ·
vikram.singh.chouhan_187371 avatar image
vikram.singh.chouhan_187371 answered

@s.decristofaro_187478. Just to add another thing apart from @Erick Ramirez answer, you raised in your question: "no ALLOW FILTERING is required" for these queries.
ALLOW FILTERING is only required when your search is either on a non-primary column or primary key columns without specifying its previous primary columns if any in the order defined during table creation. And using ALLOW FILTERING doesn't always mean it will be inefficient, it actually depends on how much disk your query is going to scan in the end.

Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.