Bringing together the Apache Cassandra experts from the community and DataStax.


question

set1984 asked ·

Why does failover not get triggered when there is only one node alive in the local DC?

I set up two data centers (DC1, DC2) with 3 nodes each. The local-datacenter is DC1. The microservice doesn't fail over to DC2 automatically when there is only one node alive in the local data center. Instead, it keeps failing with "Not enough replicas available for a query at consistency LOCAL_QUORUM (2 required but only 1 alive)". Any advice would be appreciated.


Maven dependencies:

org.springframework.boot:spring-boot:jar:2.3.7.RELEASE
org.springframework.boot:spring-boot-starter-data-cassandra:jar:2.3.7.RELEASE
org.springframework.data:spring-data-cassandra:jar:3.0.6.RELEASE
com.datastax.oss:java-driver-core:4.10.0

YAML:

spring:
  data:
    cassandra:
      consistency-level: local_quorum
      serial-consistency-level: LOCAL_SERIAL
      local-datacenter: DC1
      other configs: ...

Code Snippet :

import java.time.Duration;

import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
import org.springframework.boot.autoconfigure.cassandra.CqlSessionBuilderCustomizer;
import org.springframework.boot.autoconfigure.cassandra.DriverConfigLoaderBuilderCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CassandraConfig {

    @Bean
    public CqlSessionBuilderCustomizer sessionBuilderConfigurer() {
        return cqlSessionBuilder -> cqlSessionBuilder
                .withAuthCredentials("username", "pwd");
    }

    @Bean
    public DriverConfigLoaderBuilderCustomizer driverConfigLoaderBuilderCustomizer() {
        return loaderBuilder -> loaderBuilder
                .withDuration(DefaultDriverOption.REQUEST_TIMEOUT, Duration.ofMillis(10000))
                .withBoolean(DefaultDriverOption.LOAD_BALANCING_DC_FAILOVER_ALLOW_FOR_LOCAL_CONSISTENCY_LEVELS, true)
                .withInt(DefaultDriverOption.LOAD_BALANCING_DC_FAILOVER_MAX_NODES_PER_REMOTE_DC, 3);
    }
}
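For reference, the two dc-failover options set programmatically above correspond to the following paths in the driver's application.conf (option paths as documented for driver 4.10; treat this as a sketch, not a drop-in file):

```hocon
datastax-java-driver {
  basic.request.timeout = 10 seconds
  advanced.load-balancing-policy.dc-failover {
    max-nodes-per-remote-dc = 3
    allow-for-local-consistency-levels = true
  }
}
```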
java driver


1 Answer

Erick Ramirez answered ·

This is the expected behaviour for the driver. As long as there are nodes available in the local DC, the driver will continue to communicate with that DC.

It is only when there are no nodes available in the local DC that the driver will attempt to contact nodes in the remote DC.

When remote DCs are allowed by the load-balancing policy, the query plan (the list of nodes to contact) will contain the nodes in the local DC first, with nodes in the remote DC appended to the end. This means that the nodes in the local DC will be contacted first. Remote nodes will only be contacted when the list of local nodes has been exhausted.
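The ordering described above can be sketched with a small, self-contained toy model (this is illustrative only, not driver internals; `Node` and `queryPlan` are made-up names): local UP nodes go first in the plan, and up to `max-nodes-per-remote-dc` remote UP nodes are appended at the end.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of dc-failover query-plan ordering: local UP nodes first,
// then remote UP nodes appended, capped by maxNodesPerRemoteDc
// (a single remote DC is assumed here for simplicity).
public class QueryPlanSketch {

    record Node(String name, String dc, boolean up) {}

    static List<Node> queryPlan(List<Node> nodes, String localDc, int maxNodesPerRemoteDc) {
        List<Node> plan = new ArrayList<>();
        // 1. Local UP nodes come first; DOWN nodes are skipped entirely.
        for (Node n : nodes) {
            if (n.dc().equals(localDc) && n.up()) plan.add(n);
        }
        // 2. Remote UP nodes are appended to the end, up to the cap.
        int remoteAdded = 0;
        for (Node n : nodes) {
            if (!n.dc().equals(localDc) && n.up() && remoteAdded < maxNodesPerRemoteDc) {
                plan.add(n);
                remoteAdded++;
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        List<Node> cluster = List.of(
                new Node("Local_Node_1", "DC1", true),
                new Node("Local_Node_2", "DC1", false),
                new Node("Local_Node_3", "DC1", false),
                new Node("Remote_Node_1", "DC2", true),
                new Node("Remote_Node_2", "DC2", true),
                new Node("Remote_Node_3", "DC2", true));
        // With two local nodes down, the plan is:
        // Local_Node_1, Remote_Node_1, Remote_Node_2, Remote_Node_3
        for (Node n : queryPlan(cluster, "DC1", 3)) {
            System.out.println(n.name());
        }
    }
}
```

With two of three local nodes down, the single live local node still heads the plan, and the three remote nodes follow at the end, matching the behaviour described above.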

For details, see the Java driver Load balancing doc. Cheers!

2 comments

Hi Erick, thank you for sharing your insights. Does that mean I need to create a custom RetryPolicy and override the onUnavailable method, so that the driver will move on to the remote nodes in the query plan after the only alive local node is unable to handle the read/write request?


Will the local_quorum config prevent the driver from moving to remote DC nodes?

For example, both the local and remote DCs have 3 nodes each. Two local nodes are down. I assume the query plan looks like the list below:

#1: Local_Node_1 (UP)

#2: Remote_Node_1 (UP)

#3: Remote_Node_2 (UP)

#4: Remote_Node_3 (UP)

#5: Local_Node_2 (DOWN/IGNORED)

#6: Local_Node_3 (DOWN/IGNORED)


Will the driver move on to #2 successfully, or get stuck at #1 because the consistency level is LOCAL_QUORUM and there are not enough local replicas?
