Schniko asked

Issue in BasicLoadBalancingPolicy when local DC isn't up


I noticed an issue in the method BasicLoadBalancingPolicy.maybeAddDcFailover of the Java driver 4.13.0.

Use case: 2 datacenters, dc1 (local) and dc2 (remote). If my app starts while the local dc1 is down, liveNodes.dcs() only contains dc2. So only one DC is detected, and it isn't the local one.

Issue: the current code assumes that when only one datacenter is detected it must be the local one, so it returns no remote nodes (BasicLoadBalancingPolicy line 326 -> dcs.length <= 1). In my case, no node from dc2 is ever considered remote.

Proposed solution: remove the check on the number of DCs, since the loop just after it handles this case naturally. The initial size of remoteNodes should then be dcs.length * maxNodesPerRemoteDc. If a node is from the local DC, it will naturally be skipped. The array is then truncated to keep only the filled remote nodes (trimmedRemoteNodes at line 346).

Thus the code for the remote QueryPlan (line 325) would be as follows:

Object[] dcs = liveNodes.dcs().toArray();
Object[] remoteNodes = new Object[dcs.length * maxNodesPerRemoteDc];

instead of:

Object[] dcs = liveNodes.dcs().toArray();
if (dcs.length <= 1) {
  return EMPTY_NODES;
}
Object[] remoteNodes = new Object[(dcs.length - 1) * maxNodesPerRemoteDc];
int remoteNodesLength = 0;
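
To make the difference concrete, here is a minimal standalone sketch. It is not the driver's code: the class RemoteNodeSketch, its method names, and the Map-based model of liveNodes are all hypothetical, and it simply takes the first maxNodesPerRemoteDc nodes of each remote DC rather than reproducing how the driver actually picks them. It only shows what the early return does when dc2 is the sole live DC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: DC name -> live node names. Not the driver's data structures.
public class RemoteNodeSketch {

  // Mirrors the current check: give up when at most one DC has live nodes,
  // on the assumption that this single DC must be the local one.
  static List<String> currentRemoteNodes(
      Map<String, List<String>> liveNodesByDc, String localDc, int maxNodesPerRemoteDc) {
    if (liveNodesByDc.size() <= 1) {
      return new ArrayList<>();
    }
    return collect(liveNodesByDc, localDc, maxNodesPerRemoteDc);
  }

  // Proposed behaviour: no DC-count check; the loop skips the local DC by itself.
  static List<String> proposedRemoteNodes(
      Map<String, List<String>> liveNodesByDc, String localDc, int maxNodesPerRemoteDc) {
    return collect(liveNodesByDc, localDc, maxNodesPerRemoteDc);
  }

  // Takes up to maxNodesPerRemoteDc nodes from every DC that is not the local one.
  private static List<String> collect(
      Map<String, List<String>> liveNodesByDc, String localDc, int maxNodesPerRemoteDc) {
    List<String> remote = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : liveNodesByDc.entrySet()) {
      if (entry.getKey().equals(localDc)) {
        continue; // local nodes are never part of the remote list
      }
      List<String> nodes = entry.getValue();
      int count = Math.min(maxNodesPerRemoteDc, nodes.size());
      remote.addAll(nodes.subList(0, count));
    }
    return remote;
  }

  public static void main(String[] args) {
    // dc1 (local) is down at startup, so only dc2 is known.
    Map<String, List<String>> live = new LinkedHashMap<>();
    live.put("dc2", Arrays.asList("n4", "n5", "n6"));

    System.out.println(currentRemoteNodes(live, "dc1", 2));  // prints []
    System.out.println(proposedRemoteNodes(live, "dc1", 2)); // prints [n4, n5]
  }
}
```

When only the local DC is live, both variants return an empty list, so dropping the check should not change that case.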

Could this be considered for a future version of the driver?

Thanks!

java driver

1 Answer

Erick Ramirez answered

In our opinion, it doesn't make sense to do so, but if you feel strongly about it then you are more than welcome to submit a pull request.

Cross-DC failover was previously removed in version 4.0 and 3.7 of the Java driver (JAVA-2041) because we don't believe that the decision to switch to a remote DC should be made at the driver level. We strongly believe that failover should be done at the infrastructure level, such that app instances in a remote DC take over if the Cassandra nodes in the local DC are unavailable. Allowing remote nodes for local consistency levels breaks the guarantee that operations are done in the local DC.

This is a very important point that we don't think developers understand completely. Due to demand from the community, cross-DC failover was added back in version 4.10 (JAVA-2899). However, we continue to educate our customers and the community on what we think is the right way to handle partial or complete DC outages. Cheers!


Schniko commented


Thanks for your reply. I followed your advice and created the following PR:
I also completely understand your point. However, it doesn't match the use case at my workplace: for audit reasons, we are required to have 2 (local) datacenters available (in case one goes up in flames or is under maintenance). In reality, they are both 'local'. That's why, in my case, my app needs to be able to connect to one or the other without having to change its configuration.
