jakub.zytka_141356 asked

Behaviour of STATIC set cover finder in clusters with 4 vnodes

As reported here, the STATIC set cover finder gives worse query performance than the DYNAMIC one in a cluster with 24 nodes, 4 vnodes and replication factor 3.

Why is that? Should STATIC be the recommended one?

Tags: search, vnodes, dynamic, set cover finder

1 Answer

jakub.zytka_141356 answered

The issue was originally reported by @landonvg_125049.

Let me first explain the observed behavior and then follow with some recommendations:

When vnodes are used, Cassandra's data distribution is somewhat random, and the behavior of the routing algorithm (STATIC or DYNAMIC) depends on the actual data distribution.
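To get a feel for how uneven a random vnode layout can be, here is a minimal Python sketch. It assumes purely random token assignment on the Murmur3 token ring and looks only at primary range ownership; it does not model replication or the STATIC/DYNAMIC routing algorithms themselves. The function names `random_ring` and `primary_ownership` are made up for this illustration.

```python
import random

RING_MIN = -2**63   # lowest Murmur3Partitioner token
RING_SIZE = 2**64   # number of tokens on the ring

def random_ring(num_nodes, num_tokens, rng):
    """Give every node num_tokens random tokens; return sorted (token, node) pairs."""
    ring = [(rng.randrange(RING_MIN, RING_MIN + RING_SIZE), node)
            for node in range(num_nodes)
            for _ in range(num_tokens)]
    ring.sort()
    return ring

def primary_ownership(ring, num_nodes):
    """Fraction of the ring each node owns as primary replica.

    A token owns the range from the previous token (exclusive) up to itself;
    the modulo handles the wrap-around from the last token to the first.
    """
    owned = [0] * num_nodes
    for i, (token, node) in enumerate(ring):
        prev_token = ring[i - 1][0]
        owned[node] += (token - prev_token) % RING_SIZE
    return [width / RING_SIZE for width in owned]

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run is repeatable
    for num_tokens in (4, 8):
        own = primary_ownership(random_ring(24, num_tokens, rng), 24)
        print(f"{num_tokens} vnodes: largest share is "
              f"{max(own) / min(own):.1f}x the smallest")
```

Running it with different seeds gives different spreads, which is the point: two clusters with the same node and vnode counts can end up with quite different layouts.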

We've run tests on 100 clusters with 24 nodes, 4 vnodes, and replication factor 3 to check how the algorithms behave.

Our tests confirm that for such a config there is a tradeoff between the DYNAMIC and STATIC variants. DYNAMIC exerts 5 times more filter cache pressure than STATIC, but it often gives better load balancing (this is a statistical effect; it really depends on your data distribution).
As long as the filter cache doesn't thrash, the DYNAMIC variant may work better than STATIC in your case.

Should you run into filter cache thrashing, STATIC with inertia 2 or 4 should be a viable option to try.

Please note that 4 vnodes is not a recommended configuration. It brings the worst of both worlds: it introduces certain difficulties (one of which you ran into) without providing substantial space balancing benefits. As a consequence, we haven't performed an extensive performance evaluation of the STATIC set cover finder for such configurations, and it should not have been recommended to you. I'm sorry for all the trouble you encountered because of that.

For search workloads, it is best not to use vnodes. In that case, STATIC (with the default inertia=1) provides both perfect load balancing and perfect filter cache utilization.
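Running without vnodes means each node has a single token, and those tokens are usually spaced evenly around the ring. As a rough sketch, assuming the Murmur3Partitioner and a single datacenter, the tokens could be computed like this (`evenly_spaced_tokens` is a hypothetical helper name, not part of any Cassandra tool):

```python
def evenly_spaced_tokens(num_nodes):
    """Return one token per node, evenly spaced over the Murmur3 token range."""
    ring_min = -2**63   # lowest Murmur3Partitioner token
    ring_size = 2**64   # number of tokens on the ring
    return [ring_min + i * ring_size // num_nodes for i in range(num_nodes)]

if __name__ == "__main__":
    # One line per node; each value would go into that node's initial_token.
    for node, token in enumerate(evenly_spaced_tokens(24)):
        print(f"node {node}: initial_token: {token}")
```

Each node would then be configured with `num_tokens: 1` and its own `initial_token` in cassandra.yaml. Multi-datacenter clusters typically need per-datacenter token sets, which this sketch does not cover.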

If you do need vnodes, though, we recommend having 8 of them. With 8 vnodes it is very unlikely that STATIC works worse than DYNAMIC (unfortunately it is still possible, which is perhaps something to improve). Normally STATIC should provide both better load balancing and better filter cache usage.

Should you want to try 8 vnodes in a cluster with 24 nodes, I'd recommend STATIC with inertia set to 2, 3, or 4, depending on what works best in your environment.
