manish.c.ghildiyal_170766 asked

A query about a section of the book Cassandra: The Definitive Guide

In the book 'Cassandra: The Definitive Guide', I came across this:

Dynamo and Cassandra choose to be always writable, opting to defer the complexity of reconciliation to read operations, and realize tremendous performance gains. The alternative is to reject updates amidst network and server failures.

What exactly does this mean?


1 Answer

Erick Ramirez answered

In contrast to databases that update their data files in place, Cassandra's SSTables (data files) are immutable -- they are never modified once they've been written to disk.

Cassandra saves mutations to in-memory memtables and also appends them to a commitlog on disk (to guard against things like power failures). Because a write is only a memory update plus a sequential append, writes in Cassandra are very, VERY fast. The data is not written to SSTables immediately; the memtables are flushed to disk only after certain thresholds are reached. For details, see How data is written in Cassandra.
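
To make the write path concrete, here is a minimal toy sketch in Python. It is not Cassandra's actual implementation: the `ToyNode` class, its `flush_threshold` parameter (counted in keys), and the in-memory list standing in for the commitlog are all assumptions made purely for illustration.

```python
from types import MappingProxyType


class ToyNode:
    """Toy model of the write path; not Cassandra's real implementation."""

    def __init__(self, flush_threshold=2):
        self.commitlog = []   # stand-in for the append-only commitlog on disk
        self.memtable = {}    # in-memory: key -> (timestamp, value)
        self.sstables = []    # flushed, read-only snapshots, oldest first
        self.flush_threshold = flush_threshold  # hypothetical: max keys in memtable

    def write(self, key, value, timestamp):
        # 1. Append to the commitlog for durability (a sequential append).
        self.commitlog.append((key, value, timestamp))
        # 2. Update the memtable, keeping the newest timestamp per key.
        current = self.memtable.get(key)
        if current is None or timestamp >= current[0]:
            self.memtable[key] = (timestamp, value)
        # 3. Flush only once the threshold is reached.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the memtable into an immutable, key-sorted "SSTable".
        frozen = MappingProxyType(dict(sorted(self.memtable.items())))
        self.sstables.append(frozen)
        self.memtable = {}
        self.commitlog.clear()  # flushed data no longer needs replaying


node = ToyNode(flush_threshold=2)
node.write("user:1", "Ann", timestamp=1)
node.write("user:2", "Bob", timestamp=2)   # threshold reached, memtable is flushed
node.write("user:1", "Anna", timestamp=3)  # newer value lands in a fresh memtable
```

Note that after the last write the old value of `user:1` still sits untouched in the flushed snapshot while the new value lives in the memtable. Nothing was looked up or rewritten at write time, which is why the write is cheap.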

The tradeoff is that data can be fragmented between the memtable and the various SSTables on disk. When data is read, Cassandra needs to retrieve all the fragments and reconcile/coalesce them on-heap before it can return the result. For details, see How data is read in Cassandra. Cheers!
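
As a rough illustration of that read-time reconciliation, the sketch below merges fragments of one row by keeping, for each column, the value with the newest write timestamp. The `reconcile` function and the sample data are hypothetical, and the real read path does considerably more (tombstones for deletes, for example), so treat this as a simplified model only.

```python
def reconcile(fragments):
    """Merge row fragments; each maps column -> (timestamp, value)."""
    merged = {}
    for fragment in fragments:
        for column, (timestamp, value) in fragment.items():
            # Last write wins: keep the cell with the highest timestamp.
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    # Strip the timestamps before returning the row to the client.
    return {column: value for column, (_, value) in merged.items()}


# The same row, scattered across two SSTables and the memtable.
sstable_1 = {"name": (1, "Ann"), "city": (1, "Oslo")}
sstable_2 = {"city": (2, "Rome")}
memtable = {"email": (3, "ann@example.com")}

row = reconcile([sstable_1, sstable_2, memtable])
# row == {"name": "Ann", "city": "Rome", "email": "ann@example.com"}
```

This is the "complexity of reconciliation" that the quoted passage says is deferred to read operations: the write never had to check what was already stored, so the read has to sort it out.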
