question

Luff asked · Erick Ramirez answered

Does spark-cassandra-connector support INSERT INTO for Spark SQL?

Hi,

Recently, while using Spark SQL to work with Cassandra, I ran into some confusion.

Here is what I did:

First, I created two Cassandra tables through Spark SQL (a sketch of the catalog configuration this assumes is shown after the statements):

  • create table cassandra.testks.testtab1(colA int, colB text) using cassandra partitioned by(colA)
  • create table cassandra.testks.testtab2(colA int, colB text) using cassandra partitioned by(colA)
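
For context, here is a minimal sketch of the catalog setup these statements assume. The catalog class name is taken from the spark-cassandra-connector documentation; the application name and contact point are placeholders, not values from the original question.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: register a catalog named "cassandra" backed by the
    // spark-cassandra-connector. Host is a placeholder for a real contact point.
    val spark = SparkSession.builder()
      .appName("scc-sql-example")
      .config("spark.sql.catalog.cassandra",
        "com.datastax.spark.connector.datasource.CassandraCatalog")
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()

    // With the catalog registered, the CREATE TABLE statements above run as-is:
    spark.sql(
      "create table cassandra.testks.testtab1(colA int, colB text) using cassandra partitioned by(colA)")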

Then I tried to insert into testtab1 with "insert into testks.testtab1(colA, colB) values(1, 'a')", which throws the exception "missing primary key columns: [colA]". However, the following SQL works fine (a possible workaround along these lines is sketched after the statement):

  • insert into testks.testtab1(colA, colB) select colA, colB from testks.testtab2
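
Since the INSERT ... SELECT form works, one possible workaround (unverified, just a sketch) is to express the literal row as a single-row SELECT instead of a VALUES clause, so the query's output columns carry the expected names:

    // Unverified workaround sketch: phrase the literal row as a SELECT so the
    // incoming columns are named colA/colB rather than coming from a VALUES clause.
    spark.sql(
      "INSERT INTO testks.testtab1 SELECT 1 AS colA, 'a' AS colB")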

I found that CassandraWriteBuilder (https://github.com/datastax/spark-cassandra-connector/blob/master/connector/src/main/scala/com/datastax/spark/connector/datasource/CassandraWriteBuilder.scala) compares the "primaryKeyColumn" set against the "inputColumns" of the write.
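
For illustration only, a simplified paraphrase of the kind of check described above (not the connector's actual source): every primary key column of the target table must appear among the write's input columns. The positional column names in the failing case are my assumption, consistent with the observed error.

    // Simplified paraphrase (not the connector's source): reject the write if
    // any primary key column of the target table is missing from the input columns.
    def requirePrimaryKeyColumns(primaryKey: Seq[String], inputColumns: Seq[String]): Unit = {
      val missing = primaryKey.filterNot(inputColumns.contains)
      if (missing.nonEmpty)
        throw new IllegalArgumentException(
          s"missing primary key columns: ${missing.mkString("[", ", ", "]")}")
    }

    // The INSERT ... SELECT supplies a column named colA, so the check passes;
    // the VALUES form apparently does not, matching the error seen above.
    requirePrimaryKeyColumns(Seq("colA"), Seq("colA", "colB")) // passes
    requirePrimaryKeyColumns(Seq("colA"), Seq("col1", "col2")) // throws "missing primary key columns: [colA]"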

I'm not sure whether this is a bug or just incorrect usage on my part.

Environment: Spark 3.1.2, spark-cassandra-connector_2.12-3.1.0.

Thanks in advance!

spark-cassandra-connector

1 Answer

Erick Ramirez answered

Thanks for bringing this to our attention. It doesn't look like INSERT INTO ... VALUES is supported by the Spark connector.

I've logged a feature request on your behalf to have it implemented (SPARKC-691). Cheers!
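
In the meantime, one possible interim approach (a sketch only, assuming a SparkSession already configured with the connector as above) is to append the same rows through the DataFrame write API rather than INSERT ... VALUES:

    import org.apache.spark.sql.cassandra._   // adds cassandraFormat to DataFrameWriter
    import spark.implicits._                   // for toDF on a local Seq

    // Sketch: write the row via the DataFrame API, which the connector supports.
    Seq((1, "a")).toDF("colA", "colB")
      .write
      .cassandraFormat("testtab1", "testks")
      .mode("append")
      .save()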
