Hi,
I have an unusual case: I need to process a lot of data from a single, non-clustered Cassandra instance. Since it is not running as a cluster, I'm wondering whether it would be more efficient to run the batch job by loading the SSTables directly onto the Spark workers with the hadoop-sstable library, instead of using the connector, which I assume also opens connections to the database.
Thoughts?
By the way, has anyone tried using https://github.com/fullcontact/hadoop-sstable with Spark?