jordanp98 asked Erick Ramirez answered

Spark executor getting "Failed to open native connection to Cassandra"

Hi,
I am using Spark to extract data from a source cluster and copy it to a destination cluster with the same topology. It works most of the time, but occasionally it fails with the error below.

If everything is configured correctly, it should always work, so this inconsistency is confusing.
Could you please suggest whether I need to change any configuration on the cluster or in Spark to fix this issue?

When the error occurred, I was able to connect to node 192.168.100.51:9042 from the same system where the Spark job is running, so I don't think this is a network issue.

Is there anything else I should check?

18:43:10 [Spark Context Cleaner] INFO  org.apache.spark.ContextCleaner  - Cleaned accumulator 0
18:43:10 [Executor task launch worker for task 24] INFO  com.datastax.driver.core.ClockFactory  - Using native clock to generate timestamps.
18:43:17 [pool-17-thread-1] INFO  com.datastax.spark.connector.cql.CassandraConnector  - Disconnected from Cassandra cluster: devcass3
18:43:17 [Executor task launch worker for task 11] INFO  com.datastax.driver.core.ClockFactory  - Using native clock to generate timestamps.
18:43:17 [Executor task launch worker for task 24] ERROR org.apache.spark.executor.Executor  - Exception in task 24.0 in stage 0.0 (TID 24)
18:43:17 java.io.IOException: Failed to open native connection to Cassandra at {192.168.100.51}:9042
18:43:17     at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:168)
18:43:17     at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$8.apply(CassandraConnector.scala:154)
18:43:17     at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$8.apply(CassandraConnector.scala:154)
18:43:17     at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:32)
18:43:17     at com.datastax.spark.connector.cql.RefCountedCache.syncAcquire(RefCountedCache.scala:69)
18:43:17     at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:57)
18:43:17     at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:79)
18:43:17     at com.datastax.spark.connector.cql.DefaultScanner.<init>(Scanner.scala:27)
18:43:17     at com.datastax.spark.connector.cql.CassandraConnectionFactory$class.getScanner(CassandraConnectionFactory.scala:30)
18:43:17     at com.datastax.spark.connector.cql.DefaultConnectionFactory$.getScanner(CassandraConnectionFactory.scala:35)
18:43:17     at com.datastax.spark.connector.rdd.CassandraTableScanRDD.compute(CassandraTableScanRDD.scala:361)
18:43:17     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
18:43:17     at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
18:43:17     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
18:43:17     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
18:43:17     at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
18:43:17     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
18:43:17     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
18:43:17     at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
18:43:17     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
18:43:17     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
18:43:17     at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
18:43:17     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
18:43:17     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
18:43:17     at org.apache.spark.scheduler.Task.run(Task.scala:123)
18:43:17     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
18:43:17     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
18:43:17     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
18:43:17     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
18:43:17     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
18:43:17     at java.lang.Thread.run(Thread.java:748)
18:43:17 Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: 192.168.100.51/192.168.100.51:9042 (com.datastax.driver.core.exceptions.TransportException: [192.168.100.51/192.168.100.51:9042] Cannot connect))
18:43:17     at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:233)
18:43:17     at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:79)
18:43:17     at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1483)
18:43:17     at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:399)
spark-cassandra-connector

1 Answer

Erick Ramirez answered

The Spark connector uses the Java driver under the hood to connect to Cassandra. The driver throws a NoHostAvailableException if (a) there is no live host in the cluster, or (b) every host it tried failed.

In the stack trace you posted above, the failure was due to a TransportException:

Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: 192.168.100.51/192.168.100.51:9042 (com.datastax.driver.core.exceptions.TransportException: [192.168.100.51/192.168.100.51:9042] Cannot connect))

A TransportException is thrown by the driver when the node it is connecting to is down or unresponsive.
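
To catch a short window of unresponsiveness, a single manual connection test is usually not enough, so here is a minimal sketch of a repeated probe you could leave running on the Spark host while the job executes. It uses only the Python standard library; the names `probe` and `probe_repeatedly` are my own and are not part of the driver or the connector. Keep in mind that a successful TCP handshake only shows the port is accepting connections. It does not prove the node can answer CQL requests, so treat a clean run as a hint rather than proof.

```python
import socket
import time


def probe(host, port, timeout=2.0):
    """Try one TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection only performs the TCP handshake; it does not speak CQL.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable hosts.
        return False, time.monotonic() - start, str(exc)


def probe_repeatedly(host, port, attempts=5, interval=1.0, timeout=2.0):
    """Probe several times and return the list of outcomes (True/False)."""
    results = []
    for i in range(attempts):
        ok, elapsed, err = probe(host, port, timeout)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} {host}:{port} ok={ok} {elapsed * 1000:.0f}ms {err}")
        results.append(ok)
        if i < attempts - 1:
            time.sleep(interval)
    return results


# Example, using the address from the stack trace in the question:
# probe_repeatedly("192.168.100.51", 9042, attempts=60, interval=1.0)
```

If some attempts fail or take much longer than the rest, note the timestamps so you can compare them against the logs on the node.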

You will need to check the system.log on that node yourself for clues to the root cause. In my experience, this is almost always the result of the node being overloaded, which makes it unresponsive or makes it appear down to clients. Cheers!
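
If the log is large, something like the following sketch can narrow it down to the minutes around the failure. It assumes the default Cassandra logback layout, where each entry starts with the level, the thread name in brackets and a `YYYY-MM-DD HH:MM:SS,mmm` timestamp. The function name `suspicious_lines` and the keyword list are my own choices for illustration (GC pause reports and dropped-message reports are common signs of an overloaded node), so adjust both to what your log actually contains.

```python
import re
from datetime import datetime, timedelta

# Assumed entry layout (default logback pattern):
#   LEVEL [thread] YYYY-MM-DD HH:MM:SS,mmm File.java:NN - message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]*)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+(?P<rest>.*)$"
)

# Illustrative keywords only; extend or replace as needed.
KEYWORDS = ("GCInspector", "dropped", "StatusLogger")


def suspicious_lines(lines, failure_time, window_seconds=60):
    """Return WARN/ERROR or keyword-matching entries near failure_time."""
    lo = failure_time - timedelta(seconds=window_seconds)
    hi = failure_time + timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # continuation lines such as stack-trace frames
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        if not (lo <= ts <= hi):
            continue
        is_severe = match.group("level") in ("WARN", "ERROR")
        has_keyword = any(k in match.group("rest") for k in KEYWORDS)
        if is_severe or has_keyword:
            hits.append(line.rstrip("\n"))
    return hits
```

To use it, open the node's system.log, pass the file object as `lines`, and pass the failure time as a `datetime`. The posted Spark log only shows the time of day (18:43:17), so you will need to supply the date yourself, and check that the Spark host and the Cassandra node log in the same time zone.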
