Dharun asked · Erick Ramirez answered

Spark job returns "java.io.IOException: mkdir of dsefs://172.18.0.2/tmp/streaming/checkpoint/aQuery/state/0/1 failed"

I'm submitting a Spark job in a Docker container (DSE 6.8.15), and I have mounted a path from my host at /var/lib/dsefs. The job fails with:

WARN  2021-10-28 07:29:28,957 org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 9.0 (TID 30, 172.18.0.2, executor 1): java.io.IOException: mkdir of dsefs://172.18.0.2/tmp/streaming/checkpoint/aQuery/state/0/1 failed
    at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:1065)
    at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:161)
    at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:730)
    at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:726)
    at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
    at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:733)
    at org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.mkdirs(CheckpointFileManager.scala:305)
    at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.init(HDFSBackedStateStoreProvider.scala:224)
    at org.apache.spark.sql.execution.streaming.state.StateStoreProvider$.createAndInit(StateStore.scala:230)
    at org.apache.spark.sql.execution.streaming.state.StateStore$$anonfun$2.apply(StateStore.scala:365)
    at org.apache.spark.sql.execution.streaming.state.StateStore$$anonfun$2.apply(StateStore.scala:365)
    at scala.collection.mutable.HashMap.getOrElseUpdate(HashMap.scala:79)
    at org.apache.spark.sql.execution.streaming.state.StateStore$.get(StateStore.scala:363)
    at org.apache.spark.sql.execution.streaming.state.StateStoreRDD.compute(StateStoreRDD.scala:88)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
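
For context, this is roughly how the setup looks; the host path, container name, and jar name below are placeholders, not my exact values:

    # Run DSE 6.8.15 with analytics enabled (-k) and the DSEFS data
    # directory mounted from the host
    docker run -e DS_LICENSE=accept \
      -v /data/dsefs:/var/lib/dsefs \
      --name dse -d datastax/dse-server:6.8.15 -k

    # Submit the streaming job with its checkpoint root on DSEFS
    docker exec -it dse dse spark-submit \
      --conf spark.sql.streaming.checkpointLocation=dsefs:///tmp/streaming/checkpoint \
      /path/to/my-streaming-app.jar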
dsefs

1 Answer

Erick Ramirez answered

The exception is a low-level filesystem issue: the process is unable to create the checkpoint directory under /tmp in DSEFS. In my experience this is usually a permissions problem where the OS user running DSE doesn't have write access to the underlying filesystem.
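
A few things worth checking. This is a hedged sketch that assumes the container is named dse and /data/dsefs is the host path backing the /var/lib/dsefs mount, so adjust to your environment:

    # 1. Find the UID/GID the DSE process runs as inside the container
    docker exec dse id dse

    # 2. On the Docker host, confirm that UID can write to the mounted directory
    ls -ld /data/dsefs
    sudo chown -R <uid>:<gid> /data/dsefs   # grant ownership if it cannot

    # 3. Check the checkpoint path inside DSEFS itself
    docker exec dse dse fs "ls -l /tmp/streaming/checkpoint"
    docker exec dse dse fs "mkdir /tmp/streaming/checkpoint/aQuery"

If the host directory is owned by root (a common default for Docker volume mounts), the DSE user inside the container can't write to it, which can surface exactly as a failed mkdir in DSEFS.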

I'm limited in the assistance I can provide in a Q&A forum, so I recommend that you log a ticket with DataStax Support so one of our engineers can assist you. Cheers!
