I have a 4-node cluster (2 tables) with a replication factor (RF) of 3. I have inserted 50k records into the cluster (this is test data; the actual size will be much larger). Since RF is 3, there should ideally be a total of 50k * 3 = 150k stored copies across the cluster.
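To make the numbers concrete, here is a rough sketch of the arithmetic I have in mind. It assumes the rows are spread evenly across the 4 nodes, which is my assumption and not something I have measured; the variable names are just for illustration.

```python
# Back-of-the-envelope count of stored row copies.
# Assumption: tokens (and therefore rows) are evenly distributed across nodes.
nodes = 4
replication_factor = 3
logical_rows = 50_000

# Every logical row is stored replication_factor times somewhere in the cluster.
total_copies = logical_rows * replication_factor

# Under the even-distribution assumption, each node holds an equal share.
copies_per_node = total_copies / nodes

# Share of rows for which a given node would be the first ("original") replica.
primary_rows_per_node = logical_rows / nodes

print(total_copies)           # 150000
print(copies_per_node)        # 37500.0
print(primary_rows_per_node)  # 12500.0
```

So on each node I would expect roughly 37.5k rows on disk, of which only about 12.5k are "original" data in the sense I describe here.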
If I go to Cassandra's data directory on one of the 4 nodes and open the respective table's folder, I can see the SSTable files. There is only one data.db file. Is the replica data that the node is responsible for also stored in that same data.db file, or are the replicas stored in separate data.db files?
I currently need to take a backup snapshot of this 4-node cluster. In order to reduce disk usage, I want to back up only the original data, meaning I want to avoid backing up the replicas (they are just redundant, right?).
Is the snapshot tool capable of doing this? Or is it possible to achieve this manually? If the replica data is stored in a different data.db file, I could remove it manually.
Any help is appreciated.