First of all, I know about the restriction against mixing LWT and non-LWT operations in Cassandra.
From what I have observed in our application, one reason for this restriction is the following:
Since Java driver 3.0, a normal insert uses a timestamp generated on the client side, but an LWT insert uses a timestamp generated on the server side, and Cassandra resolves conflicts with a last-write-wins strategy.
I'm aware of the performance impact of LWT (four round trips, Paxos, etc.), but in our case we keep our DC-level distributed lock in Cassandra.
So when we try to acquire the lock we use an LWT insert, but to speed up the lock we use a normal delete when releasing it.
We then ran into data corruption caused by mixing LWT and non-LWT operations.
Specifically, our delete succeeds, but it carries an earlier timestamp than the insert, so it does not take effect.
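To make the failure mode concrete, here is a minimal Python sketch of last-write-wins reconciliation between a live cell and a tombstone. It is a toy model, not Cassandra code: the `Cell` class, the `reconcile` function, and all timestamps are hypothetical, and tie-breaking between two live values is simplified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One version of a column value; value=None models a tombstone (delete)."""
    value: Optional[str]
    timestamp_micros: int

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the surviving version under last-write-wins."""
    if a.timestamp_micros != b.timestamp_micros:
        return a if a.timestamp_micros > b.timestamp_micros else b
    # On equal timestamps a tombstone takes precedence over live data.
    if a.is_tombstone != b.is_tombstone:
        return a if a.is_tombstone else b
    # Simplification: ties between two live values are not modelled here.
    return a


# Hypothetical timestamps: the LWT insert was stamped at t=2000, while the
# plain delete was stamped by the client at t=1500 (client clock behind).
lock_row = Cell(value="owner-uuid", timestamp_micros=2000)
early_delete = Cell(value=None, timestamp_micros=1500)
late_delete = Cell(value=None, timestamp_micros=2001)

survivor_after_early = reconcile(lock_row, early_delete)
survivor_after_late = reconcile(lock_row, late_delete)

print(survivor_after_early.is_tombstone)  # False: the lock row survives
print(survivor_after_late.is_tombstone)   # True: the delete wins
```

In this model the delete is acknowledged in both cases; only the comparison of timestamps decides whether the lock row is still visible afterwards.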
Our first fix was to run a LOCAL_QUORUM query with the writetime() function to retrieve the write timestamp, add 1 millisecond to it, and set it on the delete with "USING TIMESTAMP".
We then realized this still doesn't work, because the timestamp retrieved with LOCAL_QUORUM does not seem to be the final write time of the data inserted by the LWT. We still end up issuing a delete with an earlier timestamp.
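The arithmetic behind that workaround can be sketched as follows. This is only an illustration under stated assumptions: writetime() reports microseconds since the epoch, so a 1 millisecond margin is 1000 units; the helper names and the sample numbers are hypothetical, and the "final" write time stands for whatever value the row ends up with after the read.

```python
MICROS_PER_MILLI = 1_000


def delete_timestamp_micros(observed_writetime_micros: int, margin_millis: int = 1) -> int:
    """Timestamp for the delete: observed write time plus a margin, in microseconds."""
    return observed_writetime_micros + margin_millis * MICROS_PER_MILLI


def delete_takes_effect(delete_ts_micros: int, final_writetime_micros: int) -> bool:
    """A delete shadows the row only if its timestamp is not older than the row's."""
    return delete_ts_micros >= final_writetime_micros


# Hypothetical write time returned by the SELECT.
observed = 1_700_000_000_000_000
delete_ts = delete_timestamp_micros(observed)

# Case 1: the observed write time really is the final one.
works_if_stable = delete_takes_effect(delete_ts, final_writetime_micros=observed)
# Case 2: the row later shows a write time 5 ms after the observed one.
works_if_moved = delete_takes_effect(delete_ts, final_writetime_micros=observed + 5_000)

print(delete_ts - observed)  # 1000
print(works_if_stable)       # True
print(works_if_moved)        # False
```

The second case mirrors what we see: the margin only helps if the write time we read is the one the row finally carries.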
So I actually have three questions:
- Does data inserted by an LWT have different timestamps on different replicas, each generated by the Cassandra node itself during the third step of LWT Paxos (propose / accept)?
- Does a LOCAL_QUORUM query on data inserted by an LWT report the latest writetime among the replica responses it received? For example, if 3 replicas written by an LWT hold 3 different timestamps and a LOCAL_QUORUM query reads 2 of them, does it use the later of those 2 timestamps as the write time in the response?
- If we insist on this approach (LWT insert, then normal delete), can we read the timestamp with writetime() at consistency level LOCAL_SERIAL and use it as the timestamp of the normal delete to make sure the delete takes effect?
Or is our only choice to use LWT for both the insert and the delete of our user lock, or to abandon our distributed lock on Cassandra?
CREATE TABLE "sample"."distributed_lock" ( lock_id uuid, owner uuid, PRIMARY KEY (lock_id) );
This is how we acquire the lock with LWT, CL = LOCAL_SERIAL:
CONSISTENCY LOCAL_SERIAL;
INSERT INTO "sample"."distributed_lock" (lock_id, owner) VALUES (fake-uuid-1, fake-uuid-2) IF NOT EXISTS;
This is how we previously released the lock without LWT, CL = LOCAL_QUORUM:
CONSISTENCY LOCAL_QUORUM;
SELECT WRITETIME(owner), lock_id, owner FROM "sample"."distributed_lock";
DELETE FROM "sample"."distributed_lock" USING TIMESTAMP write-time-fetched-above WHERE lock_id = fake-uuid-fetched-above;
The delete then does not take effect; moreover, the writetime returned by the SELECT differs from the writetime returned a while later.
So if we change the consistency level in the last step from LOCAL_QUORUM to LOCAL_SERIAL, will it work in all cases?
Any discussion is welcome, and thanks in advance.