pranali.khanna101994_189965 asked Erick Ramirez answered

How does data stay consistent if a batch fails midway through?

Even with a strong retry mechanism, what happens if a batch fails midway? The data already upserted into the tables would then be inconsistent, and since Cassandra does not support rollback, what happens in this case?

Also, all batch operations are atomic, right? Whether they involve multiple partitions or a single partition, the batch either succeeds or fails as a whole. So how is consistency maintained without rollback?

1 Answer

Erick Ramirez answered

A (logged) CQL BATCH is guaranteed to either (a) execute ALL of its statements, or (b) execute none of them. There isn't a situation where only some of the statements get applied.

Cassandra is able to guarantee that all statements in the batch get applied by "logging the batch" before it tries to apply the statements. The batch gets "logged" in the system.batches table on two randomly picked nodes.

IF the batch is logged successfully, C* applies all the statements in the batch. If for whatever reason some of the batch statements are unsuccessful, C* replays the batchlog until all statements have been applied successfully. This is how C* can guarantee that all the statements are applied in (a) above.

IF the batch is not logged successfully (failed writes on two nodes), the CQL BATCH is marked as failed and none of the statements get applied as per (b) above.
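The two cases above can be sketched as a toy in-memory model. This is only an illustration of the log-then-apply-then-replay idea, not Cassandra's actual implementation: `ToyCluster`, `execute_batch`, `write_failure_rate` and the other names are hypothetical, and the sketch assumes the batch counts as logged if at least one of the two batchlog copies is written.

```python
import random


class ToyCluster:
    """Toy model of a logged batch. Hypothetical names; not Cassandra code."""

    def __init__(self, node_count=5, write_failure_rate=0.0, seed=0):
        self.rng = random.Random(seed)
        self.nodes = list(range(node_count))
        self.down = set()      # nodes that cannot accept batchlog writes
        self.batchlog = {}     # node -> {batch_id: statements}
        self.table = {}        # the "real" data: key -> value
        self.write_failure_rate = write_failure_rate

    def _log_batch(self, batch_id, statements):
        # Pick two nodes to hold a copy of the batch before applying anything.
        targets = self.rng.sample(self.nodes, 2)
        written = [n for n in targets if n not in self.down]
        for node in written:
            self.batchlog.setdefault(node, {})[batch_id] = list(statements)
        # Assumption: logging fails only if both copies fail to be written.
        return bool(written)

    def _apply(self, statement):
        # A write may fail transiently; upserts are idempotent, so retrying is safe.
        if self.rng.random() < self.write_failure_rate:
            return False
        key, value = statement
        self.table[key] = value
        return True

    def execute_batch(self, batch_id, statements):
        if not self._log_batch(batch_id, statements):
            return False  # case (b): batch marked failed, nothing applied
        pending = list(statements)
        while pending:
            # Case (a): replay whatever has not succeeded yet.
            pending = [s for s in pending if not self._apply(s)]
        for entries in self.batchlog.values():
            entries.pop(batch_id, None)  # batch fully applied, drop the log
        return True


if __name__ == "__main__":
    cluster = ToyCluster(write_failure_rate=0.5, seed=1)
    print(cluster.execute_batch("b1", [("k1", "v1"), ("k2", "v2")]), cluster.table)
```

In this model there is no code path that returns with only part of the batch in `table`: either the log step fails before any statement is applied, or the replay loop keeps going until every statement has been applied.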

Given the algorithm I described, you should be able to see that there isn't a scenario where a batch is only partially applied, so there is no need for a rollback mechanism. Cheers!
