Jeoppman asked Erick Ramirez answered

PySpark on Databricks returns IOException: "Could not initialize class com.datastax.oss.driver.internal.core.config.typesafe.TypesafeDriverConfig"


I installed the following 4 libraries from Maven on my Databricks cluster:


Now this code results in a connection error:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
  .appName('SparkCassandraApp') \
  .config('', '') \
  .config('spark.cassandra.connection.port', '9042') \
  .config("spark.cassandra.auth.password", "$$$$") \
  .getOrCreate()

table = 'ts_kv_partitions_cf'
keyspace = 'thingsboard'
df = spark.read.format("org.apache.spark.sql.cassandra").load(keyspace=keyspace, table=table)

Error message: Failed to open native connection to Cassandra at {} :: Could not initialize class com.datastax.oss.driver.internal.core.config.typesafe.TypesafeDriverConfig

If I try to connect to Cassandra from the same cluster via "cassandra-driver", I can connect and retrieve data without any problem.

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider,AuthProvider
from cassandra.util import uuid_from_time
import pandas as pd
import uuid

contactPoint = [""]  # public IP of the Cassandra DEV VM
port = 9042

username = dbutils.secrets.get('Thingsboard_DEV','cassandra-admin-username')
password = dbutils.secrets.get('Thingsboard_DEV','cassandra-admin-password')

auth_provider = PlainTextAuthProvider(username=username, password=password)
cluster = Cluster(contactPoint, port=port, auth_provider=auth_provider)
session = cluster.connect("thingsboard")

Any idea what's missing? Thanks for any hints.

Best, Jens


1 Answer

Erick Ramirez answered

As I stated in my answer in #12321, you only need these libraries:

libraryDependencies += "com.datastax.spark" % "spark-cassandra-connector_2.12" % "3.0.1"

Have a look at the Quick Start Guide for details.

You should also load the Cassandra extensions, so your session builder should look something like:

  .withExtensions(new CassandraSparkExtensions)
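
Since the question uses PySpark rather than Scala, here is a rough sketch of how the same setup might look from Python. It assumes the extensions can be enabled through the `spark.sql.extensions` setting instead of `.withExtensions(...)`, and that the sbt line above corresponds to the Maven coordinate shown. The helper names `cassandra_spark_conf` and `build_session` are made up for illustration, and the host and credentials are placeholders, not values from the question.

```python
# Sketch only: helper names, host and credentials are illustrative assumptions.

# Maven coordinate form of the sbt dependency above (Scala 2.12 build).
CONNECTOR_COORDINATE = "com.datastax.spark:spark-cassandra-connector_2.12:3.0.1"


def cassandra_spark_conf(host, port=9042, username=None, password=None):
    """Return the connector-related Spark settings as a plain dict."""
    conf = {
        # Config-based counterpart of .withExtensions(new CassandraSparkExtensions)
        "spark.sql.extensions": "com.datastax.spark.connector.CassandraSparkExtensions",
        "spark.cassandra.connection.host": host,
        "spark.cassandra.connection.port": str(port),
    }
    # Credentials are optional; add them only when supplied.
    if username is not None:
        conf["spark.cassandra.auth.username"] = username
    if password is not None:
        conf["spark.cassandra.auth.password"] = password
    return conf


def build_session(conf, app_name="SparkCassandraApp"):
    """Apply every setting in conf to a SparkSession builder and start it."""
    # Imported lazily so the dict helper can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With a session built this way, the read would be `build_session(conf).read.format("org.apache.spark.sql.cassandra").options(keyspace="thingsboard", table="ts_kv_partitions_cf").load()`. On Databricks a session usually exists already, so settings such as `spark.sql.extensions` may need to go into the cluster's Spark config rather than the builder.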

Have a look at the Datasets and PySpark with DataFrames pages for examples. Cheers!
