We have a Spark job that runs on both Skylake and Cascade Lake machines. The executor configuration on both clusters is the same (same memory and cores per executor). However, on the Skylake cluster the job runs fine and completes quickly, while on the Cascade Lake cluster it takes 2 to 3 times longer to complete.
My cluster is a standalone Spark cluster running Spark 2.4.4 (not YARN or Mesos as the resource manager). There are 3 slaves, one per node/machine, plus one primary master and one standby secondary master.
When I look at the logs on the Cascade Lake machines at debug level, I see lots of errors like these:
com.datastax.driver.core.exceptions.BusyPoolException: [host3_priv/] Pool is busy (no available connection and the queue has reached its max size 256)
com.datastax.driver.core.exceptions.BusyPoolException: [host1_priv/] Pool is busy (no available connection and the queue has reached its max size 256)
com.datastax.driver.core.exceptions.BusyPoolException: [host2_priv/] Pool is busy (no available connection and the queue has reached its max size 256)
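For context, the only connector-side mitigation I know of is throttling how hard each executor hits the per-host connection pool. A minimal sketch using standard Spark Cassandra Connector 2.x settings (the numeric values are placeholder guesses to tune, not recommendations):

import org.apache.spark.SparkConf

// Sketch only: setting names are standard SCC 2.x; values are placeholders.
val conf = new SparkConf()
  .setAppName("cassandra-write-throttle-sketch") // hypothetical app name
  .set("spark.cassandra.connection.host", "host1_priv") // host from the logs above
  // Fewer write batches in flight per task leaves free slots in each
  // host's connection pool (connector default is 5).
  .set("spark.cassandra.output.concurrent.writes", "2")
  // More connections per executor gives each host pool more capacity
  // before requests start queueing (unset by default).
  .set("spark.cassandra.connection.connections_per_executor_max", "8")
  // Cap write throughput per core so faster cores cannot flood the pool
  // (default is effectively unlimited).
  .set("spark.cassandra.output.throughput_mb_per_sec", "50")

This slows the job down on purpose, so it may only trade the BusyPoolException for longer runtimes rather than explain the Skylake/Cascade Lake difference.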
My Spark Cassandra Connector version is 2.4.2.
I saw the post https://community.datastax.com/questions/3749/cassandra-pool-is-busy-no-available-connection-and.html discussing this problem, which suggests two options, (a) and (b).
For option (a), the fix is available in connector version 2.0.4 and above, as mentioned in https://datastax-oss.atlassian.net/browse/SPARKC-503?page=com.atlassian.streams.streams-jira-plugin%3Aactivity-stream-issue-tab.
But I am already on 2.4.2, which should include that fix, and the issue is still there. Is there any solution to this problem?
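In case the pool is being exhausted on the read path rather than the write path (an assumption; it depends on what the job does), the connector also exposes a reads-per-second throttle for joinWithCassandraTable-style lookups. A sketch with placeholder keyspace/table names:

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: "my_keyspace" / "my_table" are placeholders, and 1000 is
// an arbitrary starting point to tune.
val conf = new SparkConf()
  .setAppName("cassandra-read-throttle-sketch") // hypothetical app name
  .set("spark.cassandra.connection.host", "host1_priv") // host from the logs above
  // Limit lookups per core per second (default is effectively unlimited).
  .set("spark.cassandra.input.reads_per_sec", "1000")
val sc = new SparkContext(conf)

// Each element becomes one point lookup against the Cassandra table.
val keys = sc.parallelize(1 to 100000).map(Tuple1(_))
val joined = keys.joinWithCassandraTable("my_keyspace", "my_table")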