Question

rohansurana2810_190538 asked · bettina.swynnerton answered

DSE Cluster Casting Exception

Hi,

I am using DseGraphFrame to update my graph, but I am getting a ClassCastException.

Below are the Maven dependencies, the code snippet where the error occurs, and the exception with its stack trace.

pom.xml dependencies:

        <dependency>
            <groupId>com.datastax.dse</groupId>
            <artifactId>dse-external-spark-api</artifactId>
            <version>6.8.0</version>
        </dependency>
        <dependency>
            <groupId>com.datastax.oss</groupId>
            <artifactId>java-driver-core</artifactId>
            <version>4.7.2</version>
        </dependency>

The dse-external-spark-api artifact was generated by following the BYOS documentation.

ref: https://docs.datastax.com/en/dse/6.8/dse-dev/datastax_enterprise/spark/byosSparkShell.html


Code snippet where the error occurs:

DseGraphFrame gf = DseGraphFrameBuilder.dseGraph("graphname", sparkSession);

Exception:

java.lang.ClassCastException: com.datastax.driver.core.Cluster cannot be cast to com.datastax.driver.dse.DseCluster


Stack trace:

at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$$anonfun$withDataStoreDo$1.apply(DseGraphFrameBuilder.scala:184)
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$$anonfun$withDataStoreDo$1.apply(DseGraphFrameBuilder.scala:183)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withClusterDo$1.apply(CassandraConnector.scala:136)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withClusterDo$1.apply(CassandraConnector.scala:135)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:115)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:114)
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:158)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:114)
at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:135)
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$.withDataStoreDo(DseGraphFrameBuilder.scala:182)
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$.schema(DseGraphFrameBuilder.scala:131)
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$.schemaKeyspace(DseGraphFrameBuilder.scala:143)
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$.apply(DseGraphFrameBuilder.scala:118)
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$.dseGraph(DseGraphFrameBuilder.scala:214)
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder.dseGraph(DseGraphFrameBuilder.scala)


Please let me know a possible fix for this.

Thanks!

dsegraphspark

Erick Ramirez answered · bettina.swynnerton commented

Hi,

For a short-term workaround, please try this driver version; it is the one that still contains the com.datastax.driver.dse.DseCluster class:

<dependency>
  <groupId>com.datastax.dse</groupId>
  <artifactId>dse-java-driver-core</artifactId>
  <version>1.9.0</version>
</dependency>

The cluster handling changed with the newer unified drivers. I will investigate further what this means for Java Spark jobs through DSE BYOS, but let us know whether this works with the older driver.

Note that for graph queries through script API or fluent API, this driver version will fail due to protocol differences when accessing DSE Graph 6.8.
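To make the failure mode concrete, here is a minimal sketch of why this exception occurs. The `Cluster` and `DseCluster` classes below are simplified stand-ins for the real driver classes, not the actual driver code: the Spark connector constructs a plain `Cluster`, while the BYOS GraphFrame code downcasts it to `DseCluster`, and an instance constructed as the base class can never be cast to a subclass.

```scala
// Hypothetical stand-ins for the two driver classes (not the real driver API):
class Cluster
class DseCluster extends Cluster

object CastDemo {
  // Attempts the same kind of downcast that fails inside DseGraphFrameBuilder
  def tryCast(c: Cluster): String =
    try {
      c.asInstanceOf[DseCluster]
      "cast succeeded"
    } catch {
      case _: ClassCastException =>
        "ClassCastException: Cluster cannot be cast to DseCluster"
    }

  def main(args: Array[String]): Unit = {
    println(tryCast(new Cluster()))    // a plain Cluster: the downcast fails
    println(tryCast(new DseCluster())) // an actual DseCluster: the downcast is fine
  }
}
```

In other words, swapping the dependency only helps because the older driver's code path actually constructs a `DseCluster` in the first place.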

Let me know how you get on with the different dependency.

Thanks!


michaelvssolis_39462 commented:

Hi,

I am also getting the exact same error. I am using Scala/Spark + Gradle with these dependencies:
https://repo.datastax.com/public-repos/com/datastax/dse/dse-spark-dependencies/6.8.0/

https://repo.datastax.com/public-repos/com/datastax/dse/dse-graph-frames/6.8.0/


The project already uses com.datastax.dse:dse-java-driver-core:1.9.0, but the ClassCastException is still thrown when I try to load a DSE graph from Spark.


What other workarounds can I try?


Thanks!

bettina.swynnerton replied to michaelvssolis_39462:

Would you let me have your Spark job code? Is it a Java or a Scala job? I have used GraphFrames with 6.8 without issues, so I would like to see what is different.

Thanks!

bettina.swynnerton answered

As this issue was still not quite resolved, I came back to it and ran some further tests. I believe the DseCluster casting issue is the result of incorrectly submitting the Spark job.

For reference, here is an example of how to use DSE GraphFrames in DSE 6.8.1 with external Spark, using the DSE BYOS jar.

My test setup:

DSE Graph 6.8.1

Apache Spark 2.4.6 running in a different cluster


I am building and packaging the Spark job in IntelliJ as an sbt project, using sbt 0.13.18

Here is the build.sbt file:

name := "graphframe68"

version := "0.1"

scalaVersion := "2.11.12"

resolvers += "DataStax Repo" at "https://repo.datastax.com/public-repos/"

val dseVersion = "6.8.1"

libraryDependencies += "com.datastax.dse" % "dse-spark-dependencies" % dseVersion % "provided" exclude(
  "org.slf4j", "log4j-over-slf4j")

Here is the scala job, GraphFrameTest.scala

import com.datastax.spark.connector._
import com.datastax.bdp.graph.spark.graphframe._
import org.apache.spark.sql.SparkSession

object GraphFrameTest extends App {

  // Connection details are supplied at submit time via byos.properties
  val spark = SparkSession.builder
    .appName("Datastax Scala example")
    .enableHiveSupport()
    .getOrCreate()

  // dseGraph() is the implicit added by the graphframe import above
  val g = spark.dseGraph("test")

  // Show the graph's vertices to confirm connectivity
  g.V().show()

}

Note: you need to build and package the job, then copy the jar to the Spark cluster where you want to submit it. You cannot run this directly from IntelliJ.

To build and package:

sbt package

Before you can submit the job on the Spark node, make sure you have copied the DSE BYOS jar from the DSE Graph cluster. In the case of a package installation, you can find it on the DSE Graph cluster here: /usr/share/dse/clients/dse-byos_2.11-6.8.1.jar

In my test, I copied this file to the SPARK_HOME directory on the Spark cluster master node.

In addition, you need to export the byos.properties from the DSE Graph cluster.

To get the byos.properties, run the following command on a DSE Graph cluster node:

dse client-tool configuration byos-export ~/byos.properties

Copy it to the Spark cluster. I placed it into the SPARK_HOME on the master node.

Then, to submit your job, you need to specify the DSE BYOS jar and the properties file, along with the class you want to run and the job jar file.

Here is the complete submit command:

$SPARK_HOME/bin/spark-submit --master spark://xxx.xxx.xxx.xxx:7077 --jars dse-byos_2.11-6.8.1.jar --properties-file byos.properties --class GraphFrameTest graphframe68_2.11-0.1.jar
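Before running spark-submit, a quick pre-flight check can save a round trip. This is just a sketch: it assumes the three filenames from the walkthrough above and that everything was copied into $SPARK_HOME as described.

```shell
# Pre-flight check (a sketch): confirm the three artifacts from the
# walkthrough are in place under $SPARK_HOME before running spark-submit.
check_artifacts() {
  for f in dse-byos_2.11-6.8.1.jar byos.properties graphframe68_2.11-0.1.jar; do
    if [ -e "$SPARK_HOME/$f" ]; then
      echo "found: $f"
    else
      echo "missing: $f"
    fi
  done
}

check_artifacts
```

If any line reports "missing", copy that file over before submitting.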


I hope these steps help you understand how to use DSE GraphFrames in a Spark job.

