Sakiv asked Sakiv commented

How does Cassandra ensure integrity of data in commitlog and SSTables?

I had seen a similar question on another thread, but felt it may have been about in-flight data, so I'm posting afresh.

How does Cassandra ensure the integrity of stored data, such as the commit log and SSTables? How can tampering with that data be identified?

Under compression we have crc_check_chance, which defaults to 1. Does that ensure the integrity of data is checked whenever data is read?

What is stored in Digest.crc32?

I'd appreciate it if you could point me to some documentation on this.

starlord answered Erick Ramirez edited

The importance of repair in a distributed system should not be understated. It's one of the main methods for ensuring data integrity in combination with your Replication Factor which increases durability.

Assuming you have a Replication Factor greater than 1 (3 is most common), repair will run comparisons and identify inconsistencies as well as corruption. When corruption is identified, manual intervention is typically required: the data is scrubbed or the corrupt SSTables are removed, and repair can then stream data from another replica.

Corruption is also commonly identified through reads and compactions. Using a stronger Consistency Level will read more replicas to ensure consistent data, but if corruption is reported, manual intervention is again required to repair it. If corruption is found during compaction, the compaction will fail and the same manual intervention will be required.

You are right about crc_check_chance ensuring data integrity of reads, and left at the default of 1, all reads are checked.
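As a rough illustration of what a probabilistic check on read means, here is a minimal Python sketch. It is not Cassandra's code: `read_chunk`, `ChunkCorruptionError` and the `rng` parameter are hypothetical names, and CRC32 via `zlib` is assumed as the checksum.

```python
import random
import zlib


class ChunkCorruptionError(Exception):
    """Raised when a chunk's checksum does not match the stored value."""


def read_chunk(payload: bytes, stored_crc: int,
               crc_check_chance: float = 1.0, rng=random.random) -> bytes:
    """Return payload, verifying its CRC32 with the given probability.

    rng() yields a float in [0.0, 1.0), so a chance of 1.0 checks every
    read and a chance of 0.0 checks none (hypothetical semantics chosen
    for this sketch).
    """
    if rng() < crc_check_chance:
        if zlib.crc32(payload) != stored_crc:
            raise ChunkCorruptionError("checksum mismatch on read")
    return payload
```

With the chance left at 1, every call verifies the checksum, which matches the behaviour described above; lowering it trades detection for less CPU work per read.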

In fact, most people using Cassandra do use compression, so for the majority, the integrity of these compressed SSTables is verified as follows: a checksum is written to the Digest SSTable file, and verification then just iterates over the Data.db file and compares checksums.

The Digest.crc32 file is just a CRC32 checksum of the Data.db file of the same SSTable set (older Cassandra versions wrote an Adler-32 checksum to Digest.adler32 instead).
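A minimal sketch of that kind of whole-file check, in Python. It assumes the digest is a CRC32 (as the file name suggests) stored as a decimal number in text form; `file_crc32` and `matches_digest` are hypothetical helpers, and this is not Cassandra's own verification code.

```python
import zlib
from pathlib import Path


def file_crc32(path, block_size: int = 64 * 1024) -> int:
    """Compute the CRC32 of a file by streaming it in fixed-size blocks."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            # Feed the running value back in so the result covers the whole file.
            crc = zlib.crc32(block, crc)
    return crc


def matches_digest(data_path, digest_path) -> bool:
    """Compare the data file's CRC32 with the value stored in the digest file.

    Assumption: the digest file contains the checksum as decimal text.
    """
    expected = int(Path(digest_path).read_text().strip())
    return file_crc32(data_path) == expected
```

If `matches_digest` returns False for a Data.db and Digest.crc32 pair, the data file no longer matches what was written. Note that this detects accidental corruption only; a plain checksum offers no protection against deliberate tampering, since whoever alters the data can recompute it.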

Hopefully this helps, but let us know if there are additional questions.

Erick Ramirez answered Sakiv commented

Like most modern systems, Cassandra implements several layers of error-checking:

  • messaging layer - checksum verification to confirm that the payload from the client is valid
  • storage layer - SSTable component *-CRC.db holds the CRC32 for chunks in uncompressed files
  • data layer - the metadata in the partition header matches the data in the underlying partition and rows
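To make the storage-layer idea concrete, here is a small Python sketch of per-chunk checksums. The chunk size, function names and in-memory layout are assumptions for illustration only and do not reflect the actual format of the CRC component.

```python
import zlib
from itertools import zip_longest
from typing import List


def chunk_crcs(data: bytes, chunk_size: int) -> List[int]:
    """Return one CRC32 per fixed-size chunk of data (last chunk may be short)."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def corrupt_chunks(data: bytes, expected: List[int], chunk_size: int) -> List[int]:
    """Return the indexes of chunks whose CRC32 differs from the expected list.

    A missing or extra chunk (for example after truncation) is also reported.
    """
    actual = chunk_crcs(data, chunk_size)
    return [i for i, (a, e) in enumerate(zip_longest(actual, expected))
            if a != e]
```

Keeping a checksum per chunk rather than one per file means a reader can validate just the chunk it needs, and a mismatch points at the damaged region instead of the whole file.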

In question #7876, I've explained how the native protocol contains metadata that allows the senders and receivers to easily validate that the contents of the message are valid.

I've also explained there that with a replication factor of 3 in each DC, reading with a strong consistency level reduces the likelihood that multiple replicas have corrupted data. In the very unlikely event that this happens to you, you have bigger problems than Cassandra -- it means that you have a catastrophic infrastructure issue affecting multiple servers/hardware, and a software solution isn't going to fix your problems. Cheers!


Sakiv commented

@Erick Ramirez @starlord Appreciate you both taking time and explaining the same. Thought I had thanked you folks earlier but guess I missed. Thanks again! Your guidance here is appreciated.
