Hi,
The attached file shows the range reads issued while reading a table from Cassandra. One Cassandra node receives many more read requests than the other nodes, and it is always the same node (VPC-CASSANDRA-005). Once those reads are done, that node receives 0 reads until the task ends.
Read requests to all the other nodes are balanced. Is this a connector bug, or something external, e.g. configuration?
Environment:
- Amazon Linux AMI 2018.03
- java version "1.8.0_131"
- spark-core_2.12-3.1.1.jar
- spark-cassandra-connector_2.12-3.1.0.jar
- Cassandra 3.1.1 [cqlsh 5.0.1 | DSE 5.1.22 | CQL spec 3.4.4 | DSE protocol v1]
- 36 Cassandra servers
- Spark runs on a remote server - 12 instances - 5 cores each
Thanks,
Shai