I am running Spark 3.0.1 on Kubernetes and fetching data from Cassandra using the Spark Cassandra Connector.
My use case: a Spark JavaRDD<Key> is repartitioned by calling repartitionByCassandraReplica against table X, so that its partitions are local to the Cassandra replicas, and is then joined with the same table using joinWithCassandraTable. This works on Spark Standalone, where Spark and Cassandra run on the same server: the Spark partitions are localized after repartitionByCassandraReplica and before joinWithCassandraTable is called. But when I try the same thing on Kubernetes, where Spark and Cassandra run in separate Pods,
repartitionByCassandraReplica seems to fail, as no data locality is obtained in the Spark container.
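To illustrate what I suspect is happening, here is a minimal sketch in plain Java. It is not connector code. It assumes that locality is only achieved when a Cassandra replica address is identical to the host of a Spark executor, and that executors in separate Pods get their own Pod IPs. The class name, method name, and all addresses are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class LocalityCheck {

    // Assumption: a task can only be scheduled "node local" when the executor's
    // host string equals one of the replica addresses for the partition.
    static List<String> localExecutors(Set<String> replicaAddresses,
                                       List<String> executorHosts) {
        return executorHosts.stream()
                .filter(replicaAddresses::contains)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical Cassandra replica addresses.
        Set<String> replicas = new HashSet<>(Arrays.asList("10.0.0.1", "10.0.0.2"));

        // Standalone: executors run on the same servers as Cassandra.
        List<String> standaloneExecutors = Arrays.asList("10.0.0.1", "10.0.0.2");

        // Kubernetes: executors run in their own Pods with their own Pod IPs.
        List<String> podExecutors = Arrays.asList("172.17.0.5", "172.17.0.6");

        System.out.println(localExecutors(replicas, standaloneExecutors)); // [10.0.0.1, 10.0.0.2]
        System.out.println(localExecutors(replicas, podExecutors));        // []
    }
}
```

If this assumption is right, no executor host ever matches a replica address in my Kubernetes setup, which would explain why the partitions are not localized.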
What am I missing to make this work? How can the network shuffling between Spark and Cassandra be minimized?