What is the best and most accurate way to get the record count of a Cassandra table in a multi-node Apache Cassandra 3.11.6 cluster with a replication factor of 3?
Performing a CQL COUNT() has always been problematic in Cassandra, not because Cassandra isn't capable of it, but because of a challenge inherent in its distributed architecture: an unrestricted count has to scan the whole table across every node in the cluster. I've written about this problem in detail in a blog post, "Counting keys? Might as well be counting stars."
Luckily, we now have the DataStax Bulk Loader (the dsbulk tool) to the rescue. Primarily designed as a more efficient tool for bulk loading data in CSV or JSON format into a Cassandra cluster, the Bulk Loader also features the ability to perform a distributed count of the records in a table.
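As a rough illustration, here is a minimal Python sketch that wraps the `dsbulk count` command. It assumes `dsbulk` is installed and on your PATH; the function names are hypothetical, and the host, keyspace, and table are placeholders you supply. It also assumes the count is what dsbulk prints on standard output, so check that against your version of the tool.

```python
import subprocess


def build_count_command(host, keyspace, table):
    """Build the argument list for a `dsbulk count` invocation.

    -h is the contact point, -k the keyspace, -t the table.
    """
    return ["dsbulk", "count", "-h", host, "-k", keyspace, "-t", table]


def count_rows(host, keyspace, table):
    """Run dsbulk and return whatever it prints on standard output.

    Assumption: the row count is written to stdout; progress and
    log messages are not captured in the returned text.
    """
    result = subprocess.run(
        build_count_command(host, keyspace, table),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if dsbulk exits non-zero
    )
    return result.stdout.strip()
```

For example, `count_rows("10.0.0.1", "my_ks", "my_table")` runs the equivalent of `dsbulk count -h 10.0.0.1 -k my_ks -t my_table`, where the address and names are placeholders for your own cluster.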
Here are the key references on the Bulk Loader tool: