boris_187128 asked:

Trying to connect to AWS Cassandra with datastax.spark.connector without success

Hi Everyone, happy to be part of the community.

I'm trying to read a Cassandra table into a Spark DataFrame. Here is the code (using PySpark on AWS EMR for testing):


import boto3
from pyspark.sql import SparkSession

# Download the truststore from S3 into the local working directory
s3 = boto3.client('s3', aws_access_key_id='$$$', aws_secret_access_key='$$$$')
s3.download_file('bucket','cassandra_truststore.jks','cassandra_truststore.jks')

# Configure the connector for TLS on port 9142 with username/password auth
spark = SparkSession.builder \
  .appName('SparkCassandraApp') \
  .config('spark.cassandra.connection.host', 'cassandra.us-east-1.amazonaws.com') \
  .config('spark.cassandra.connection.port', '9142') \
  .config('spark.cassandra.connection.ssl.enabled','true') \
  .config('spark.cassandra.connection.ssl.trustStore.path','cassandra_truststore.jks') \
  .config('spark.cassandra.connection.ssl.trustStore.password','amazon') \
  .config("spark.cassandra.auth.username","$$$$$")\
  .config("spark.cassandra.auth.password","$$$$$") \
  .getOrCreate()

# table and keyspace are strings holding the table and keyspace names
df = spark.read.format("org.apache.spark.sql.cassandra").options(table=table, keyspace=keyspace).load()

The error is:

py4j.protocol.Py4JJavaError: An error occurred while calling o120.load.
: java.lang.IllegalArgumentException: Unsupported partitioner: local
at com.datastax.spark.connector.rdd.partitioner.dht.TokenFactory$.forCassandraPartitioner(TokenFactory.scala:92)

How can I resolve this? I've been stuck for 2 days.

Thanks a lot for the help

sparkconnector

1 Answer

Erick Ramirez answered:

@boris_187128 I looked for documentation on how the partitioner is configured in MCS (Amazon Managed Apache Cassandra Service), but there doesn't seem to be any public information about it. The error you posted indicates that MCS reports a partitioner called local:

java.lang.IllegalArgumentException: Unsupported partitioner: local
    at com.datastax.spark.connector.rdd.partitioner.dht.TokenFactory$.forCassandraPartitioner(TokenFactory.scala:92)

I can, however, confirm that the spark-cassandra-connector supports only two partitioners, handled in TokenFactory.scala by:

  • Murmur3TokenFactory
  • RandomPartitionerTokenFactory

For more information, see Cassandra Partitioners. This means that MCS isn't supported by the connector at this stage. I've logged SPARKC-587 on your behalf.
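
To make the failure concrete, here is a minimal Python sketch of the kind of check that produces this error. It is not the connector's source (which is Scala): the lookup table, token_factory_for, and cluster_partitioner are hypothetical names used for illustration only. It also assumes the cluster reports its partitioner in the partitioner column of system.local, as Apache Cassandra does.

```python
# Illustrative sketch only; NOT the spark-cassandra-connector's actual code.
# Assumption: the two supported token factories correspond to Cassandra's
# Murmur3Partitioner and RandomPartitioner classes.
SUPPORTED_PARTITIONERS = {
    "org.apache.cassandra.dht.Murmur3Partitioner": "Murmur3TokenFactory",
    "org.apache.cassandra.dht.RandomPartitioner": "RandomPartitionerTokenFactory",
}


def token_factory_for(partitioner_name):
    """Return the token factory name for a partitioner, or raise if unsupported."""
    try:
        return SUPPORTED_PARTITIONERS[partitioner_name]
    except KeyError:
        raise ValueError(f"Unsupported partitioner: {partitioner_name}") from None


def cluster_partitioner(session):
    """Read the partitioner name the cluster reports about itself.

    session is any object with an execute() method whose result has one(),
    for example a connected Session from the Python cassandra-driver.
    """
    row = session.execute("SELECT partitioner FROM system.local").one()
    return row.partitioner
```

With a connected driver session, token_factory_for(cluster_partitioner(session)) would presumably fail against MCS in the same way, since the name it reports (local, per the error above) is not in the supported set.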

In the meantime, I recommend you try DataStax Astra -- a cloud-native service built with the best distribution of Apache Cassandra. You can try it for FREE. Cheers!


Hi! Yes, I'm trying to connect to MCS. Thanks so much for the help.


@boris_187128 Welcome, and thanks for being a part of the DataStax Community. A friendly note: I've converted your post into a comment since it is not an "answer". Cheers!


Got it! Thanks! I will use DynamoDB for now.


Not a problem. Good luck!
