I am using spark-cassandra-connector_2.12 v3.2.0 with Spark v3.3.0, Cassandra v3.11.11 (also tried v3.11.13), Azul Zulu Java 1.8.0_312, and Scala v2.12.15, all on Windows.
I get the following exception when running a simple query from spark-shell.cmd: spark.sql("SELECT * FROM lcc.cw_junk.junk").show. I watched Russell Spitzer's "DataSource V2 and Cassandra" video, and it made this look easy. What am I doing wrong?
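For reference, this is the query exactly as I type it in the shell, plus what I understand to be the equivalent read through the DataFrame API (just a sketch using the connector's documented source name and options; I have only been using the SQL/catalog route shown in the log below):

    // The query I run against the Cassandra catalog (configured as "lcc" below):
    spark.sql("SELECT * FROM lcc.cw_junk.junk").show

    // Sketch of what I believe is the equivalent DataFrame-API read,
    // using the connector's documented format name and options:
    val df = spark.read
      .format("org.apache.spark.sql.cassandra")
      .option("keyspace", "cw_junk")
      .option("table", "junk")
      .load()
    df.show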
Output from spark-shell.cmd (note: the packages are excluded because they fail to download; the same --conf settings are also restated as a builder-style sketch after the log):
spark-shell.cmd --packages com.datastax.spark:spark-cassandra-connector_2.12:3.2.0 --conf spark.cassandra.connection.host=127.0.0.1 --conf spark.cassandra.auth.username=myusername --conf spark.cassandra.auth.password=mypassword --conf spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions --conf spark.sql.catalog.lcc=com.datastax.spark.connector.datasource.CassandraCatalog --conf spark.sql.defaultCatalog=lcc --exclude-packages com.thoughtworks.paranamer:paranamer,com.google.code.findbugs:jsr305

:: loading settings :: url = jar:file:/C:/s330/jars/ivy-2.5.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: C:\Users\myusername\.ivy2\cache
The jars for the packages stored in: C:\Users\myusername\.ivy2\jars
com.datastax.spark#spark-cassandra-connector_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-26e2827a-0592-475e-97ad-e4d2ab4b8408;1.0
    confs: [default]
    found com.datastax.spark#spark-cassandra-connector_2.12;3.2.0 in central
    found com.datastax.spark#spark-cassandra-connector-driver_2.12;3.2.0 in central
    found com.datastax.oss#java-driver-core-shaded;4.13.0 in central
    found com.datastax.oss#native-protocol;1.5.0 in central
    found com.datastax.oss#java-driver-shaded-guava;25.1-jre-graal-sub-1 in central
    found com.typesafe#config;1.4.1 in central
    found org.slf4j#slf4j-api;1.7.26 in spark-list
    found io.dropwizard.metrics#metrics-core;4.1.18 in central
    found org.hdrhistogram#HdrHistogram;2.1.12 in central
    found org.reactivestreams#reactive-streams;1.0.3 in central
    found com.github.stephenc.jcip#jcip-annotations;1.0-1 in central
    found com.github.spotbugs#spotbugs-annotations;3.1.12 in central
    found com.datastax.oss#java-driver-mapper-runtime;4.13.0 in central
    found com.datastax.oss#java-driver-query-builder;4.13.0 in central
    found org.apache.commons#commons-lang3;3.10 in central
    found org.scala-lang#scala-reflect;2.12.11 in central
:: resolution report :: resolve 683ms :: artifacts dl 52ms
    :: modules in use:
    com.datastax.oss#java-driver-core-shaded;4.13.0 from central in [default]
    com.datastax.oss#java-driver-mapper-runtime;4.13.0 from central in [default]
    com.datastax.oss#java-driver-query-builder;4.13.0 from central in [default]
    com.datastax.oss#java-driver-shaded-guava;25.1-jre-graal-sub-1 from central in [default]
    com.datastax.oss#native-protocol;1.5.0 from central in [default]
    com.datastax.spark#spark-cassandra-connector-driver_2.12;3.2.0 from central in [default]
    com.datastax.spark#spark-cassandra-connector_2.12;3.2.0 from central in [default]
    com.github.spotbugs#spotbugs-annotations;3.1.12 from central in [default]
    com.github.stephenc.jcip#jcip-annotations;1.0-1 from central in [default]
    com.typesafe#config;1.4.1 from central in [default]
    io.dropwizard.metrics#metrics-core;4.1.18 from central in [default]
    org.apache.commons#commons-lang3;3.10 from central in [default]
    org.hdrhistogram#HdrHistogram;2.1.12 from central in [default]
    org.reactivestreams#reactive-streams;1.0.3 from central in [default]
    org.scala-lang#scala-reflect;2.12.11 from central in [default]
    org.slf4j#slf4j-api;1.7.26 from spark-list in [default]
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   16  |   0   |   0   |   0   ||   16  |   0   |
    ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-26e2827a-0592-475e-97ad-e4d2ab4b8408
    confs: [default]
    0 artifacts copied, 16 already retrieved (0kB/29ms)
22/06/23 10:44:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://myhost.mydomain.com:4040
Spark context available as 'sc' (master = local[*], app id = local-1655999101732).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.0
      /_/

Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_312)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.sql("describe table lcc.cw_junk.junk").show
22/06/23 10:45:11 WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped
+--------------+---------+-------+
|      col_name|data_type|comment|
+--------------+---------+-------+
|         junk1|   string|       |
|         junk2|   string|       |
|         junk3|   bigint|       |
|         junk4|   bigint|       |
|              |         |       |
|# Partitioning|         |       |
|        Part 0|    junk1|       |
+--------------+---------+-------+

scala> spark.sql("SELECT * FROM lcc.cw_junk.junk").show
java.lang.IllegalArgumentException: Unsupported data source V2 partitioning type: CassandraPartitioning
  at org.apache.spark.sql.execution.datasources.v2.V2ScanPartitioning$$anonfun$apply$1.applyOrElse(V2ScanPartitioning.scala:46)
  at org.apache.spark.sql.execution.datasources.v2.V2ScanPartitioning$$anonfun$apply$1.applyOrElse(V2ScanPartitioning.scala:34)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
  at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$3(TreeNode.scala:589)
  at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1228)
  at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1227)
  at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:208)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:589)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$3(TreeNode.scala:589)
  at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1228)
  at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1227)
  at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:208)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:589)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$3(TreeNode.scala:589)
  at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1228)
  at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1227)
  at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:208)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:589)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
  at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
  at org.apache.spark.sql.execution.datasources.v2.V2ScanPartitioning$.apply(V2ScanPartitioning.scala:34)
  at org.apache.spark.sql.execution.datasources.v2.V2ScanPartitioning$.apply(V2ScanPartitioning.scala:33)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
  at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
  at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
  at scala.collection.immutable.List.foldLeft(List.scala:91)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
  at scala.collection.immutable.List.foreach(List.scala:431)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
  at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:126)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
  at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:122)
  at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:118)
  at org.apache.spark.sql.execution.QueryExecution.assertOptimized(QueryExecution.scala:136)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:154)
  at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:151)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:106)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3856)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2863)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:3084)
  at org.apache.spark.sql.Dataset.getRows(Dataset.scala:288)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:327)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:808)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:767)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:776)
  ... 47 elided

scala>
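For completeness, here is the same configuration expressed programmatically rather than via --conf flags (just a sketch; in practice I start spark-shell.cmd exactly as shown above, and the username/password placeholders stand in for the real values):

    import org.apache.spark.sql.SparkSession

    // Same settings as the --conf flags passed to spark-shell.cmd above:
    val spark = SparkSession.builder()
      .master("local[*]")
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .config("spark.cassandra.auth.username", "myusername")
      .config("spark.cassandra.auth.password", "mypassword")
      .config("spark.sql.extensions", "com.datastax.spark.connector.CassandraSparkExtensions")
      .config("spark.sql.catalog.lcc", "com.datastax.spark.connector.datasource.CassandraCatalog")
      .config("spark.sql.defaultCatalog", "lcc")
      .getOrCreate()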