What are snapshots and why are they used in Cassandra?

I'm using open-source Cassandra from the Windows command prompt (CMD). I have created two tables.

One table has some rows inserted and the other table is empty.

Now when I go to data/data/keyspacename/tablename/..

I only see a folder called backups.

But after I dropped the two tables, one table folder contains a snapshot folder with a single file, schema.cql. (include the image file)

The other table folder has a snapshot folder with a lot of files with different extensions (.db, .cql, .json). (include image file)

What is actually happening in my keyspace? What is this so-called snapshot doing in my keyspace?

Also, what is the difference between a snapshot and hinted handoff?

1 Answer

Hi @chandrasekar.b03_190734,

If you look into the configuration file cassandra.yaml, you will find a section about auto-snapshots:

# Whether or not a snapshot is taken of the data before keyspace truncation
# or dropping of column families. The STRONGLY advised default of true 
# should be used to provide data safety. If you set this flag to false, you will
# lose data on truncation or drop.
auto_snapshot: true

By default this is set to true.

It is a safety feature, in case you did not mean to drop the table. A snapshot is basically a backup of your schema and data files as they were at the time you dropped the table. If you dropped it by mistake, you can restore your table and data from the snapshot.

Without the auto-snapshot, a drop or truncate irreversibly deletes your data.
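To make the restore idea a bit more concrete, here is a minimal Python sketch of the file-copying part only. It assumes you have already recreated the table (for example from the snapshot's schema.cql) and know both directories; the function name `copy_snapshot_data` and its parameters are my own and not part of Cassandra.

```python
import shutil
from pathlib import Path

# Files that describe the snapshot rather than hold table data.
METADATA_FILES = {"schema.cql", "manifest.json"}


def copy_snapshot_data(snapshot_dir, table_dir):
    """Copy the data files of one snapshot into a (recreated) table directory.

    Both arguments are paths chosen by the caller (hypothetical helper,
    not a Cassandra API). Returns the sorted names of the copied files.
    """
    snapshot_dir, table_dir = Path(snapshot_dir), Path(table_dir)
    copied = []
    for item in sorted(snapshot_dir.iterdir()):
        # Skip subdirectories and the snapshot's own metadata files.
        if item.is_file() and item.name not in METADATA_FILES:
            shutil.copy2(item, table_dir / item.name)
            copied.append(item.name)
    return copied
```

After the files are in place, a running node usually still has to be told to load them, typically with `nodetool refresh <keyspace> <table>`. Please check the documentation for your Cassandra version before relying on this, and keep in mind that a recreated table normally gets a new directory name.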

The answer in this question has good info on how snapshots work.

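The many files with the varying extensions are your SSTable files. These are your data files. The manifest.json contains information about the snapshot itself, and schema.cql holds the table definition. The table that was empty had no data files on disk, which is why its snapshot contains only schema.cql.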
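If you want to check this on your own node, the following Python sketch groups the files of each snapshot by kind. It is only an illustration: the helper `describe_snapshots` is made up for this answer, and it assumes the snapshots sit in a `snapshots` subfolder of the table folder, with one subfolder per snapshot.

```python
from pathlib import Path


def describe_snapshots(table_dir):
    """Group the files of every snapshot under table_dir by kind.

    Returns {snapshot_name: {"schema": [...], "manifest": [...], "data": [...]}}.
    Hypothetical helper for inspection only, not a Cassandra API.
    """
    result = {}
    # Assumption: snapshots are stored in <table_dir>/snapshots/<snapshot_name>/
    snapshots_root = Path(table_dir) / "snapshots"
    if not snapshots_root.is_dir():
        return result
    for snap in sorted(snapshots_root.iterdir()):
        if not snap.is_dir():
            continue
        kinds = {"schema": [], "manifest": [], "data": []}
        for f in sorted(snap.iterdir()):
            if not f.is_file():
                continue
            if f.name == "schema.cql":
                kinds["schema"].append(f.name)
            elif f.name == "manifest.json":
                kinds["manifest"].append(f.name)
            else:
                # Everything else is treated as an SSTable component file.
                kinds["data"].append(f.name)
        result[snap.name] = kinds
    return result
```

For the table that had rows, you would expect a non-empty "data" list. For the empty table, it should stay empty.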

I hope this helps!

