DhavalBhatt asked Erick Ramirez commented

Slow Spark job execution impacting overall experience

We have a 9-node Spark cluster that runs a background job, submitted through the Spark Job Server UI and Jenkins.

For the last couple of days, I have been seeing the Spark job take significantly longer to complete. However, I am sure there has been no new data load into the system, and we have not made any changes to the job or anything else.

One thing I would like to mention, and where I need help: most of the jobs that are taking longer to complete depend on a table that has a high number of tombstones.

  • Average live cells per slice (last five minutes): 9.260071466283065
  • Maximum live cells per slice (last five minutes): 5001.0
  • Average tombstones per slice (last five minutes): 1.3884200136199125
  • Maximum tombstones per slice (last five minutes): 9795.0
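
For anyone who wants to collect the same figures from several nodes, here is a minimal sketch that pulls the four per-slice metrics out of text in the format shown above. The format matches what `nodetool cfstats` prints, but that source is an assumption, and the function name `parse_slice_metrics` is made up for illustration.

```python
import re

# Matches lines such as:
#   "Maximum tombstones per slice (last five minutes): 9795.0"
_METRIC_RE = re.compile(
    r"(Average|Maximum) (live cells|tombstones) per slice "
    r"\(last five minutes\):\s*([0-9.]+)"
)


def parse_slice_metrics(text):
    """Return a dict such as {"maximum_tombstones": 9795.0} from stats text."""
    metrics = {}
    for kind, what, value in _METRIC_RE.findall(text):
        key = f"{kind.lower()}_{what.replace(' ', '_')}"
        metrics[key] = float(value)
    return metrics


# The sample is the output quoted in the question.
sample = """
  • Average live cells per slice (last five minutes): 9.260071466283065
  • Maximum live cells per slice (last five minutes): 5001.0
  • Average tombstones per slice (last five minutes): 1.3884200136199125
  • Maximum tombstones per slice (last five minutes): 9795.0
"""

for name, value in parse_slice_metrics(sample).items():
    print(f"{name}: {value}")
```

Lines whose value is not a plain number are skipped, so missing keys in the result should be treated as "not reported".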

I know that a high number of tombstones can cause read issues while performing background operations. The DSE version is quite old (4.8.*) and the table uses DateTieredCompactionStrategy, so I am not sure how to bring the tombstone count down, since manual compaction and read repair do not work well with date-tiered compaction (correct me if I am wrong here).

dse, spark-cassandra-connector, spark

1 Answer

Erick Ramirez answered Erick Ramirez commented

You have a complex problem and I'll need a lot more diagnostic information to be able to help you.

I'm limited in the assistance I can give in a Q&A forum, and you would be best served by logging a ticket with DataStax Support so one of our engineers can assist you directly. Cheers!


DhavalBhatt commented ·

Thanks. I have access to that portal, but I need to discuss this with my team first. I also have a minor update on the tombstone information: I copied the figures from one node only, but I have since run the same query on more nodes and found the following:

Maximum tombstones per slice range from 22000 to 39500.

Average tombstones per slice range from 4.## to 12.##.
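
To put those maxima in context, here is a rough sketch that compares them with the stock `tombstone_warn_threshold` (1000) and `tombstone_failure_threshold` (100000) settings from `cassandra.yaml`. Whether this cluster still uses those defaults is an assumption, the node names are placeholders, and the values are simply the single-node figure and the two ends of the range reported above.

```python
# Assumed stock cassandra.yaml values; check the actual settings on the cluster.
WARN_THRESHOLD = 1000        # tombstone_warn_threshold
FAILURE_THRESHOLD = 100000   # tombstone_failure_threshold


def classify(max_tombstones, warn=WARN_THRESHOLD, fail=FAILURE_THRESHOLD):
    """Label a tombstones-per-read count relative to the two thresholds."""
    if max_tombstones > fail:
        return "failure"   # a read scanning this many would be aborted
    if max_tombstones > warn:
        return "warn"      # a read scanning this many logs a warning
    return "ok"


# Placeholder node names; values taken from the figures in this thread.
observed = {
    "node-a": 9795.0,
    "node-b": 22000.0,
    "node-c": 39500.0,
}

for node, value in observed.items():
    print(f"{node}: max tombstones per slice {value:.0f} -> {classify(value)}")
```

Under those assumed defaults, all three values are above the warning level but below the point where reads would be aborted, which would fit slow rather than failing jobs.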

Erick Ramirez ♦♦ DhavalBhatt commented ·

Unfortunately, it's impossible for me to comment without diagnostic data. Cheers!
