Bringing together the Apache Cassandra experts from the community and DataStax.

chandrasekar.b03_190734 asked ·

What is Paxos Consensus in Cassandra LWT?

Hi,

The information quoted below is from the DataStax documentation (see the topic on linearizable consistency):

"Lightweight transaction write operations use the serial consistency level for Paxos consensus and the regular consistency level for the write to the table".

So what is Paxos consensus in C* LWT? Why do LWTs have both serial and regular consistency levels, as the docs specify?

Kindly let me know the answers in your available time :))

Erick Ramirez answered ·

There are situations where updates to the database must be performed in sequence and cannot be interrupted until they have completed. This is usually not a problem for relational databases, since the database locks records while a process is writing to them.

A typical example is when a user registers for a new account. Registering a new username is problematic when there are hundreds or thousands of sessions running concurrently.

Usernames are unique and can only be allocated once. To allocate a username, the following events must be performed in sequence (serial):

  1. Check that the username has not already been registered (read).
  2. If (1) is true, allocate the username (write).

For this sequence of events to succeed, it must be isolated from all other requests such that another user session cannot update the table between the check (step 1) and allocation (step 2) above. It needs to be linearizable (in programming terms) when multiple processes (user sessions) are performing the same events concurrently.
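The difference between a separate check-then-write and an isolated check-and-write can be sketched in a few lines of Python. This is only an in-memory illustration, not Cassandra code: `UsernameTable` and its methods are hypothetical names, and a local lock stands in for the isolation that an LWT such as `INSERT ... IF NOT EXISTS` provides across replicas.

```python
import threading


class UsernameTable:
    """Hypothetical in-memory stand-in for a table keyed by username."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def exists(self, username):
        return username in self._rows

    def write(self, username, owner):
        self._rows[username] = owner

    def owner_of(self, username):
        return self._rows.get(username)

    def insert_if_not_exists(self, username, owner):
        # The check (step 1) and the write (step 2) run as one isolated unit.
        with self._lock:
            if username in self._rows:
                return False  # condition failed, nothing is written
            self._rows[username] = owner
            return True


# Unsafe interleaving: both sessions run step 1 before either runs step 2.
unsafe = UsernameTable()
a_sees_free = not unsafe.exists("alice")
b_sees_free = not unsafe.exists("alice")
if a_sees_free:
    unsafe.write("alice", "session-A")
if b_sees_free:
    unsafe.write("alice", "session-B")  # silently overwrites session-A
unsafe_owner = unsafe.owner_of("alice")

# Isolated check-and-write: only the first session is applied.
safe = UsernameTable()
a_applied = safe.insert_if_not_exists("alice", "session-A")
b_applied = safe.insert_if_not_exists("alice", "session-B")
safe_owner = safe.owner_of("alice")
```

In the unsafe run both sessions believe they own the username and the later write wins. In the isolated run the second request is rejected, which is the behaviour the username example needs.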

Cassandra achieves this isolation and linearizable (serial) consistency using the Paxos protocol. Paxos achieves consensus in distributed environments using a quorum-based algorithm.
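To give a feel for what "quorum-based" means here, below is a minimal sketch of classic single-decree Paxos in Python. It is not Cassandra's implementation (which adds its own read and commit steps around the proposal); the `Acceptor` class, the `propose` function and the integer ballots are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Acceptor:
    """Hypothetical replica taking part in one Paxos round."""
    promised: int = 0                     # highest ballot promised so far
    accepted_ballot: int = 0              # ballot of the accepted value, if any
    accepted_value: Optional[str] = None

    def prepare(self, ballot):
        # Promise to ignore older ballots; report any value already accepted.
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted_ballot, self.accepted_value
        return False, self.accepted_ballot, self.accepted_value

    def accept(self, ballot, value):
        # Accept unless a newer ballot has been promised in the meantime.
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], ballot: int, value: str) -> Optional[str]:
    """Return the value chosen in this round, or None if no quorum was reached."""
    quorum = len(acceptors) // 2 + 1

    # Phase 1: prepare / promise.
    replies = [a.prepare(ballot) for a in acceptors]
    granted = [(b, v) for ok, b, v in replies if ok]
    if len(granted) < quorum:
        return None

    # If a value was already accepted, the proposer must finish that one.
    prior_ballot, prior_value = max(granted, key=lambda p: p[0])
    if prior_value is not None:
        value = prior_value

    # Phase 2: propose / accept.
    acks = sum(1 for a in acceptors if a.accept(ballot, value))
    return value if acks >= quorum else None


replicas = [Acceptor() for _ in range(3)]
first = propose(replicas, ballot=1, value="alice -> session-A")
second = propose(replicas, ballot=2, value="alice -> session-B")
stale = propose(replicas, ballot=1, value="alice -> session-C")
```

With three replicas the quorum is two. The first proposal is chosen, the second proposer is forced to adopt the already accepted value instead of its own, and the proposal with the stale ballot gets no promises at all.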

The serial consistency levels are used during the Paxos phase of lightweight transactions, that is, the read-before-write that checks the condition (for example, that a username does not exist). The consistency level for the reads in the Paxos phase is different from the normal consistency level because those reads need to prevent updates to the data, which guarantees that the LWT condition holds for the lifetime of the operation. If the LWT condition is true, the write is then performed using the normal consistency level specified.

See Jonathan Ellis' blog post on Lightweight transactions in Cassandra for a detailed description of the Paxos protocol algorithm used in Cassandra. Cheers!


Hi Erick,

Understood the whole scenario with a relevant example in real time. Thanks for your great support all the way!! :)))

smadhavan answered ·

@chandrasekar.b03_190734, thank you for your question about Paxos consensus.

Regarding your statement,

Why LWT has both serial as well as regular Consistency levels as specified by the docs?

For the Paxos phase, LWT/CAS transactions support only the SERIAL or LOCAL_SERIAL consistency levels. Refer to the documentation on how the SERIAL consistency level is configured for managing lightweight transactions for additional details.

To explain Paxos consensus, I am going to direct you to the resources below, which already explain those concepts clearly:

I do hope these resources help you understand the topic!


Hi @smadhavan,

Thanks for letting me know more about those topics with the resources you provided. :)))
