question

What does DSBulk count?

Hi,
Suppose we have a wide cluster with two datacenters, DC1 and DC2, each with five servers.
So 10 nodes in total, and the replication factor (RF) for the table's keyspace is 3.

Now, I want to count the rows in BIGTABLE, so I run:
dsbulk count -k myts -t bigtable -h <ip_dc1_node1>

I get 500,000 as a response.

And finally the question: is 500,000 the number of rows on that one node, or the number of rows in the whole cluster (10 nodes)?


The count operation computes the number of rows in the table bigtable in the keyspace myts.

Based on your description, let's assume the keyspace is defined like this:

```
CREATE KEYSPACE IF NOT EXISTS myts
WITH REPLICATION = {
  'class' : 'NetworkTopologyStrategy',
  'DC1' : 3,
  'DC2' : 3
};
```

With 500,000 total rows in your table, you would have:

• 1,500,000 row replicas in DC1 (500,000 rows × RF 3). We cannot tell exactly how many are stored on each node, but with the 5 nodes you stated in DC1 it averages about 300,000 per node.
• 1,500,000 row replicas in DC2 (500,000 rows × RF 3). Again, we cannot tell exactly how many are stored on each node, but with 5 nodes in DC2 it averages about 300,000 per node.
• 3,000,000 row replicas in the whole cluster (replicas in DC1 + replicas in DC2).
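
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `replica_counts` and its parameters are made up for this example, and the per-node figure is an average that assumes data is spread evenly across the nodes, which a real cluster only approximates.

```python
def replica_counts(total_rows, rf_per_dc, nodes_per_dc):
    """Estimate stored row replicas per datacenter (hypothetical helper)."""
    result = {}
    for dc, rf in rf_per_dc.items():
        copies = total_rows * rf  # each row is stored rf times in this DC
        result[dc] = {
            "replicas": copies,
            # Average only; assumes an even spread across the DC's nodes.
            "avg_per_node": copies // nodes_per_dc[dc],
        }
    return result


# Figures from this thread: 500,000 rows, RF 3 per DC, 5 nodes per DC.
counts = replica_counts(
    total_rows=500_000,
    rf_per_dc={"DC1": 3, "DC2": 3},
    nodes_per_dc={"DC1": 5, "DC2": 5},
)
cluster_total = sum(dc["replicas"] for dc in counts.values())

print(counts)
print(cluster_total)
```

The input of 500,000 is the number of rows in the table, which is what the count reports; the replica figures describe how many copies of those rows are stored.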

DSBulk documentation: https://docs.datastax.com/en/dsbulk/doc/dsbulk/reference/dsbulkCmd.html


@kajarvine_115939 that result is the count for the whole table, not for a single node. The host you provided on the command line is just the initial contact point so DSBulk can connect to the cluster. It then runs the query against the table to get the result. Cheers!
