lziegler asked

Spark output to NAS returns mkdirs failure on _temporary directories

My development team is hitting mkdirs permission failures when writing CSV or Parquet output from Spark to a NAS drive.

Caused by: Mkdirs failed to create file:/mycompany/testcase/tmp/quick/pushdatazzz/_temporary/0/_temporary/attempt_20191118173538_0003_m_000000_9 (exists=false, cwd=file:/apps/cassandra/data/data2/spark/rdd/app-20191118173446-0234/0)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(
    at org.apache.hadoop.fs.ChecksumFileSystem.create(
    at org.apache.hadoop.fs.FileSystem.create(
    at org.apache.hadoop.fs.FileSystem.create(
    at org.apache.hadoop.fs.FileSystem.create(
    at org.apache.spark.sql.execution.datasources.CodecStreams$.createOutputStream(CodecStreams.scala:81)
    at org.apache.spark.sql.execution.datasources.CodecStreams$.createOutputStreamWriter(CodecStreams.scala:92)
    at org.apache.spark.sql.execution.datasources.csv.CsvOutputWriter.<init>(CSVFileFormat.scala:135)
    at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anon$1.newInstance(CSVFileFormat.scala:77)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:303)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:312)

[fakeuser@fakenode quick]$ ls -al
total 36
drwxrwxrwt 8 fakeuser fakegroup 4096 Nov 18 17:35 .
drwxrwxrwx 4 500 500 8192 Nov 18 15:56 ..
drwxr-xr-x 2 fakeuser fakegroup 4096 Nov 18 17:35 pushdatazzz

No subdirectories are created whatsoever.

We would expect Spark to create temporary directories from the nodes where the query executed and then coalesce the output into one file. Without coalescing, we would expect multiple part files in this directory.

While searching for answers I found various posts suggesting

--conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2

but this was of no help.
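For reference, a flag like this would typically be passed at submit time; the following is only a sketch, and the job script name is a placeholder:

```shell
# Hypothetical submit command; my_job.py stands in for the actual application.
spark-submit \
  --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 \
  my_job.py
```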

Has anyone else faced this dilemma? Are we expected to write to DSEFS first and then copy the file to local storage?


1 Answer

Russell Spitzer answered

By default, the writer creates a target/_temporary directory. This is how the Hadoop file writers used by Spark are implemented: task output files are first written into the temporary directory and, once complete, moved into the actual target.
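The write-then-rename idea can be sketched in plain Python; this is only an illustration of the pattern, not Hadoop's actual committer code:

```python
import os
import shutil
import tempfile

def commit_style_write(target_dir: str, filename: str, data: str) -> str:
    """Sketch of the temporary-then-rename pattern Hadoop committers use.

    Task output goes into target/_temporary first; on "commit" it is moved
    into the target directory. Illustrative only, not Hadoop's implementation.
    """
    tmp_dir = os.path.join(target_dir, "_temporary")
    os.makedirs(tmp_dir, exist_ok=True)  # this is the mkdirs step that fails in the question
    tmp_path = os.path.join(tmp_dir, filename)
    with open(tmp_path, "w") as f:
        f.write(data)
    final_path = os.path.join(target_dir, filename)
    shutil.move(tmp_path, final_path)    # commit: move into the real target
    shutil.rmtree(tmp_dir, ignore_errors=True)
    return final_path

target = tempfile.mkdtemp()
path = commit_style_write(target, "part-00000.csv", "a,b\n1,2\n")
print(os.path.basename(path))  # part-00000.csv
```

If the mkdirs in the first step fails, nothing ever gets written, which matches the empty output directory shown in the question.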

Although I don't have the full stack trace for your error, I'm guessing this is an executor exception. If so, the executor process may be running as a different user than the one submitting the job, and that user may not have permission to write to the target directory. By default in a DSE Analytics cluster, executors are launched as the DSE user, so check that user's permissions on the NAS path.
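You can reproduce the symptom locally with a short Python sketch: when the writing process lacks write permission on the target directory (as a differently-privileged executor user would), the mkdirs call fails. Assumes a non-root user, since root bypasses permission bits:

```python
import os
import stat
import tempfile

# Make a target directory that the current user can read and traverse
# but not write to, simulating a directory owned by another user.
target = tempfile.mkdtemp()
os.chmod(target, stat.S_IRUSR | stat.S_IXUSR)  # r-x: no write bit

try:
    # The same kind of call Hadoop's writer makes for _temporary subdirs.
    os.makedirs(os.path.join(target, "_temporary", "0"))
    print("mkdirs succeeded")
except PermissionError as e:
    print(f"mkdirs failed: {e}")

os.chmod(target, stat.S_IRWXU)  # restore so the temp dir can be cleaned up
```

Fixing the ownership or group-write permissions on the NAS path for the executor's user resolves this class of failure.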
