I am looking for information that describes the Apache Cassandra Bloom filter structure and where that structure resides in an Apache Cassandra configuration.
Hi @Erick Ramirez. This may not be the right place, but rather than open a new discussion I would like to raise a point about Bloom filters. In the learning path, the quiz for the 'Read path' step of DSE201 asks "Which of the following structures reside on disk? Check all that apply", with the choices: key cache, SSTable, bloom filter, partition summary, partition index. The correct answer requires selecting bloom filter as well. Bloom filters do reside on disk, but they are loaded into memory (thanks for telling us it is off-heap), and reads use this in-memory bloom filter, not the copy on disk. The on-disk copy exists only so that the filters do not have to be recalculated on every node restart (C* instance restart, I mean). So I think the question is ambiguous. It could be reworded as something like "Which of the following structures reside on disk for read purposes?", although I am not sure that is a good idea, as it could be confusing. But I think it is also confusing to say that bloom filters reside in memory when they are also stored on disk, just not used from there for any client read. Thanks and good day.
@rb2017houser_184626 Each SSTable has an associated bloom filter component file with the suffix *-Filter.db. When an SSTable is opened at startup, its bloom filter is loaded from that file into off-heap memory, and the in-memory copy is checked as part of the Cassandra read path to decide whether the SSTable might contain the requested partition.
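To make the structure concrete, here is a minimal Python sketch of a Bloom filter and of how a read could use one filter per SSTable to skip files that definitely do not hold a partition. It is an illustration only, not Cassandra's implementation: the class BloomFilter, the helper sstables_to_read, the SHA-256 based hashing and the SSTable names are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus k hash probes per key (hypothetical class)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, key: bytes):
        # Derive k bit positions from two base hashes (double hashing).
        # SHA-256 is an assumption here, chosen only because it is in the stdlib.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        # Set every probed bit for this key.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def sstables_to_read(filters: dict, partition_key: bytes) -> list:
    """Return the names of SSTables whose filter does not rule out the key."""
    return [name for name, bf in filters.items() if bf.might_contain(partition_key)]


# Usage: one in-memory filter per (hypothetical) SSTable.
filters = {"sstable-1": BloomFilter(1024, 3), "sstable-2": BloomFilter(1024, 3)}
filters["sstable-1"].add(b"user:42")
print(sstables_to_read(filters, b"user:42"))
```

A filter can return a false positive (an SSTable is read and the key turns out not to be there) but never a false negative, which is why it is safe to skip an SSTable whenever the filter says no.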
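Each table's bloom filter size (in memory) is determined by the bloom_filter_fp_chance property in the table's schema: a lower false-positive chance requires a larger filter and therefore more memory, while a higher value shrinks the filter at the cost of more unnecessary SSTable reads.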
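As a rough illustration of that trade-off, the sketch below applies the textbook Bloom filter sizing formulas (bits per key = -ln(p) / (ln 2)^2, hash probes = bits per key * ln 2). Cassandra's own sizing logic may round or cap these values differently, so treat the numbers as estimates; the function name bloom_filter_size and the one million key count are assumptions for the example.

```python
import math


def bloom_filter_size(num_keys: int, fp_chance: float) -> tuple:
    """Estimate (total bits, hash probes) for a target false-positive chance.

    Uses the standard textbook formulas, not Cassandra's internal sizing code.
    """
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    num_bits = math.ceil(num_keys * bits_per_key)
    num_hashes = max(1, round(bits_per_key * math.log(2)))
    return num_bits, num_hashes


# Compare a few false-positive targets for a hypothetical 1,000,000 partition keys.
for p in (0.1, 0.01, 0.001):
    bits, k = bloom_filter_size(1_000_000, p)
    print(f"fp_chance={p}: {bits / 8 / 1024 / 1024:.2f} MiB, {k} hash probes")
```

Each tenfold reduction in the false-positive chance adds roughly 4.8 bits per key, so memory use grows steadily as the setting is tightened.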
For more info, see the following documents: