I am reading a Cassandra table in Spark and running a count on the resulting DataFrame. My Spark job created 292 tasks; 290 of them finish quickly, but the remaining 2 run much longer. I was under the impression that Spark partitions should all be roughly the same size, since the number of partitions depends on data size / input split size (in MB), so data skew should not occur. But is it possible that a very large Cassandra partition causes data skew across the Spark partitions?
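For intuition, here is a minimal, simplified sketch in plain Python (this is not the actual spark-cassandra-connector code, and all sizes are hypothetical) of how token ranges get packed into Spark partitions up to a target split size. Since a single Cassandra partition lives entirely within one token range, an oversized partition cannot be subdivided and lands whole in one Spark partition:

```python
# Simplified model (assumption: NOT the real connector implementation) of
# grouping Cassandra token ranges into Spark partitions. Each token range
# carries an estimated size in MB, and ranges are packed greedily until the
# target input split size is reached.

def pack_into_spark_partitions(range_sizes_mb, split_size_mb=64):
    """Greedily group token-range sizes into Spark partitions."""
    partitions, current, current_size = [], [], 0
    for size in range_sizes_mb:
        # Flush the current group once adding another range would exceed
        # the target split size. A single range is never split further.
        if current and current_size + size > split_size_mb:
            partitions.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        partitions.append(current)
    return partitions

# Hypothetical layout: many small ~2 MB ranges plus one 900 MB range that
# holds a single huge Cassandra partition.
ranges = [2] * 290 + [900]
parts = pack_into_spark_partitions(ranges)
sizes = [sum(p) for p in parts]
print(max(sizes))  # the 900 MB range stays in one Spark partition
```

Under this model, even though most Spark partitions come out near the target split size, the huge Cassandra partition produces one Spark partition far larger than the rest, which would explain a couple of straggler tasks.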