As reported here, the STATIC set cover finder gives worse query performance than the DYNAMIC one in a cluster with 24 nodes, 4 vnodes and replication factor 3.
Why is it so? Should STATIC be the recommended one?
The issue was originally reported by @landonvg_125049
Let me first explain the observed behavior and then follow with some recommendations:
When vnodes are used, Cassandra data distribution is somewhat random, and the behavior of the routing algorithm (STATIC/DYNAMIC) depends on the actual data distribution.
We've run tests on 100 clusters with 24 nodes, 4 vnodes, and rf 3 to check the algorithm behavior.
Our tests confirm that for such a config there is a tradeoff between the DYNAMIC and STATIC variants. The DYNAMIC version exerts 5 times more filter cache pressure than the STATIC one, but it often allows better load balancing (that's a statistical effect; it really depends on your data distribution).
As long as the filter cache doesn't thrash, the DYNAMIC variant may work better than STATIC in your case.
Should you run into filter cache thrashing, STATIC with inertia 2 or 4 should be a viable option to try.
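To get a feel for how much the randomness of vnode placement matters, here is a rough Python sketch. It is not the DSE routing code and it does not model STATIC or DYNAMIC at all. It only assumes that every vnode gets a uniformly random token on a 2^64 ring and measures how unevenly primary token ranges end up spread across nodes, ignoring replication. The function names (`ownership_fractions`, `mean_imbalance`) are made up for this illustration.

```python
import random

RING_SIZE = 2**64  # size of the token ring assumed for this sketch


def ownership_fractions(num_nodes, vnodes_per_node, rng):
    """Fraction of the ring primarily owned by each node (replication ignored)."""
    # Assumption: every vnode gets an independent, uniformly random token.
    tokens = []
    for node in range(num_nodes):
        for _ in range(vnodes_per_node):
            tokens.append((rng.randrange(RING_SIZE), node))
    tokens.sort()

    owned = [0] * num_nodes
    for i, (token, node) in enumerate(tokens):
        # A token owns the range from the previous token up to itself;
        # index -1 handles the wrap-around at the start of the ring.
        prev_token = tokens[i - 1][0]
        owned[node] += (token - prev_token) % RING_SIZE

    total = sum(owned)
    return [o / total for o in owned]


def mean_imbalance(num_clusters, num_nodes, vnodes_per_node, seed=0):
    """Average over many random clusters of (largest node share / ideal share)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(num_clusters):
        fractions = ownership_fractions(num_nodes, vnodes_per_node, rng)
        # The ideal share is 1 / num_nodes, so this ratio is 1.0 when balanced.
        ratios.append(max(fractions) * num_nodes)
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # 100 simulated clusters of 24 nodes, mirroring the test setup size above.
    for vnodes in (4, 8):
        ratio = mean_imbalance(100, 24, vnodes, seed=42)
        print(f"{vnodes} vnodes: busiest node owns {ratio:.2f}x the ideal share")
```

A ratio of 1.0 would mean perfectly even ownership. Each simulated cluster gives a different value, which is the "depends on your actual data distribution" effect described above.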
Please note that 4 vnodes is not a recommended configuration. The reason is that it brings the worst of both worlds: it introduces certain difficulties (one of which you ran into), while it doesn't provide substantial space balancing benefits. As a consequence, we haven't performed an extensive performance evaluation of the STATIC set cover finder for such configurations, and it should not have been recommended to you. I'm sorry for all the trouble you encountered because of that.
For search workloads, it is best when you do not use vnodes. In such a case, STATIC (with the default inertia=1) provides both perfect load balancing and perfect filter cache utilization.
If you do need vnodes, though, we recommend having 8 of them. With 8 vnodes it is very unlikely (but unfortunately still possible, a thing to improve perhaps) that STATIC works worse than DYNAMIC. Normally it should provide both better load balancing and better filter cache usage.
Should you want to try 8 vnodes in a cluster with 24 nodes, I'd recommend STATIC with inertia set to 2, 3 or 4, depending on what works best in your environment.