I would like to create a Cassandra setup for a multi data-center system.
Each data center should operate "locally", with the following requirements:
1/ In case of a complete site failure, traffic will be directed to the other (healthy) site and the application nodes will access Cassandra nodes on that site (read/write).
2/ In case of a network partition (network disconnection), each site should continue to work using its own Cassandra nodes. Later - after network connectivity is restored, the expectation is for data in both sites to be sync-ed automatically.
The application is operated without any data-center affinity, i.e., requests for a specific key can arrive at any of the sites virtually "in the same time" (to be more precise, data update to site B can happen before an update for site A has synced with site B). (Sometimes this is referred to as "Active/Active").
The application state data is stored in a blob (per each key/record). There is no way to "merge" two "generations" of this data.
a/ If we use consistency-level EACH_QUORUM – it seems that both requirements (1) and (2) are not fulfilled, as if a site fails, or is disconnected, any query will fail as the EACH_QUORUM will not be satisfied.
Using EACH_QUORUM has also a performance implications penalty, as each (at least) write operation will have to cross a long geographical distance. This performance penalty cannot be mitigated for this application.
b/ If we use LOCAL_QUORUM – requirement (1) will be satisfied.
However, in case of a network disconnection between sites, and assuming same keys are updated on both sites during the disconnection period, we believe that there will be DATA LOSS when the network connection is restored and the Cassandra attempts to synchronize the data.
Per our understanding – the ‘state data’ with the most recent timestamp will “win”, and updates which have been done to the blob data with the less-recent timestamp will be lost ("Last Write Win" paradigm).
c/ Similar to (b), and even when network connection between sites is functioning - the same data loss phenomena may happen also when two updates to the same key happen at the "same time" (i.e., before update from other sites have been exchanged - using LOCAL_QUORUM)
Is this understanding correct?
If so – is there a proposed way to set the cluster such that both requirements ( (1) and (2) ) are fulfilled? How?
Thanks very much,