question

shantanughar asked:

Is there a way to track changes in Cassandra 3.7 for incremental data loading?

Hi, I'm very new to the ETL world and I want to implement incremental data loading with Cassandra 3.7 and Spark. I'm aware that later versions of Cassandra support CDC, but I can only use Cassandra 3.7. Is there a way to track only the changed records and use Spark to load them, so that the load is incremental?

If it can't be done on the Cassandra end, any suggestions on the Spark side are also welcome :)

cassandra

1 Answer

Erick Ramirez answered:

There is a Kafka sink connector that reads records from a Kafka topic and writes them to Cassandra. For details, see the DataStax Apache Kafka Connector.

To be clear, this is a sink connector, which means that Cassandra is the destination of the data, not the source.

There is no out-of-the-box solution available at this point which allows you to consume the mutations in the Change-Data-Capture (CDC) logs in Cassandra to use as a data source for another system. You will need to implement a custom solution to achieve this. I'm sorry that I don't have any examples you could reference.
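One common shape for such a custom solution is a timestamp watermark: the application writes a last-modified timestamp with every change, and each load picks up only the rows newer than the previous run's high-water mark. The sketch below shows just that selection logic in plain Python. It is a generic pattern, not a Cassandra or Spark feature, and the `Row` type, the `updated_at` column, and the `incremental_batch` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Row:
    id: int
    payload: str
    # Hypothetical column: the application must set this on every insert/update.
    updated_at: datetime


def incremental_batch(
    rows: Iterable[Row], watermark: datetime
) -> Tuple[List[Row], datetime]:
    """Return rows changed after `watermark`, plus the new watermark."""
    changed = sorted(
        (r for r in rows if r.updated_at > watermark),
        key=lambda r: r.updated_at,
    )
    # Advance the watermark only if something was actually loaded.
    new_watermark = changed[-1].updated_at if changed else watermark
    return changed, new_watermark


# Sample data standing in for a table scan (assumed values, for illustration).
t0 = datetime(2020, 1, 1, tzinfo=timezone.utc)
rows = [
    Row(1, "a", t0),
    Row(2, "b", t0 + timedelta(hours=1)),
    Row(3, "c", t0 + timedelta(hours=2)),
]

# First run: everything after t0 is loaded and the watermark moves forward.
changed, watermark = incremental_batch(rows, t0)

# Second run with no new writes: nothing to load, watermark unchanged.
changed_again, watermark_again = incremental_batch(rows, watermark)
```

In a Spark job the same comparison would typically be a filter on the loaded DataFrame, with the watermark persisted between runs. Note the limits of this approach: it does not see deletes, it depends on every writer maintaining the timestamp, and filtering on a non-key column may require scanning the whole table unless the data model is designed for it.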

As a side note, Cassandra 3.7 is not a supported version. It was released 4 years ago (June 2016) and is no longer maintained. You will need to install either the older C* 3.0 release or the newer 3.11 release. Cheers!
