Question

srinu.gajjala321_68185 asked · Erick Ramirez edited

Jobs are failing with java.lang.IllegalArgumentException at java.nio.Buffer.limit(Buffer.java:275) when writing from one keyspace to another

Hi,


I'm trying to load data from one table to another table in a different keyspace using an external Spark cluster. The job fails with the errors below:


java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:275)
    at com.datastax.driver.core.CodecUtils.readBytes(CodecUtils.java:153)
    at com.datastax.driver.core.TypeCodec$AbstractUDTCodec.deserialize(TypeCodec.java:2141)
    at com.datastax.driver.core.AbstractGettableByIndexData.get(AbstractGettableByIndexData.java:383)
    at com.datastax.driver.core.AbstractGettableData.get(AbstractGettableData.java:26)
    at com.datastax.spark.connector.GettableData$.get(GettableData.scala:103)
    at com.datastax.spark.connector.CassandraRow$.dataFromJavaDriverRow(CassandraRow.scala:190)
    at org.apache.spark.sql.cassandra.CassandraSQLRow$.fromJavaDriverRow(CassandraSQLRow.scala:51)
    at org.apache.spark.sql.cassandra.CassandraSQLRow$CassandraSQLRowReader$.read(CassandraSQLRow.scala:58)
    at org.apache.spark.sql.cassandra.CassandraSQLRow$CassandraSQLRowReader$.read(CassandraSQLRow.scala:55)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$15.apply(CassandraTableScanRDD.scala:345)
    at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$15.apply(CassandraTableScanRDD.scala:345)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at scala.collection.Iterator$$anon$12.next(Iterator.scala:444)
    at com.datastax.spark.connector.util.CountingIterator.next(CountingIterator.scala:16)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at com.datastax.spark.connector.util.CountingIterator.hasNext(CountingIterator.scala:12)
    at com.datastax.spark.connector.writer.GroupingBatchBuilder.hasNext(GroupingBatchBuilder.scala:101)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at com.datastax.spark.connector.writer.GroupingBatchBuilder.foreach(GroupingBatchBuilder.scala:31)
    at com.datastax.spark.connector.writer.TableWriter$$anonfun$writeInternal$1.apply(TableWriter.scala:233)
    at com.datastax.spark.connector.writer.TableWriter$$anonfun$writeInternal$1.apply(TableWriter.scala:210)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:112)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:111)
    at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:145)
    at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:111)
    at com.datastax.spark.connector.writer.TableWriter.writeInternal(TableWriter.scala:210)
    at com.datastax.spark.connector.writer.TableWriter.insert(TableWriter.scala:197)
    at com.datastax.spark.connector.writer.TableWriter.write(TableWriter.scala:183)
    at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
    at com.datastax.spark.connector.RDDFunctions$$anonfun$saveToCassandra$1.apply(RDDFunctions.scala:36)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

We have a few UDTs in Cassandra, and the same UDTs are present in the new table we are moving the data to.
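For context, the copy is done roughly like this (the contact point, keyspace, and table names below are placeholders, not our real ones):

```scala
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("cross-keyspace-copy")
  .set("spark.cassandra.connection.host", "10.0.0.1") // placeholder contact point

val sc = new SparkContext(conf)

// Read all rows (including the UDT columns) from the source table...
val rows = sc.cassandraTable("source_ks", "source_table")

// ...and write them to the table in the destination keyspace.
rows.saveToCassandra("dest_ks", "dest_table")
```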

Any help would be appreciated. Please let me know if more information is needed.

cc @Russell Spitzer

Thanks.

spark-cassandra-connector
1 comment

markc commented:

Could you add your table schema and an example of how you're copying the data in Spark?


1 Answer

Erick Ramirez answered

@srinu.gajjala321_68185 The stack trace you provided indicates that the job is failing when reading data from one of the tables. In my experience, the most common cause of this type of read failure is that the metadata doesn't match the order of the columns on disk, which leads to the IllegalArgumentException because the type codec is not what was expected.

If this turns out to be the cause, I recommend reviewing the application queries and explicitly specifying the column names instead of using SELECT * FROM ... to make sure that you get the expected columns. Cheers!
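As a minimal sketch of what that looks like with the connector's RDD API (the column names here are hypothetical placeholders for your actual schema):

```scala
import com.datastax.spark.connector._

// Naming the columns on both the read and the write pins the column order
// and types the connector expects, instead of relying on SELECT *.
// "id", "name", and "address" are placeholder columns; "address" stands in
// for one of your UDT columns.
sc.cassandraTable("source_ks", "source_table")
  .select("id", "name", "address")
  .saveToCassandra("dest_ks", "dest_table", SomeColumns("id", "name", "address"))
```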
