question

mukheja.nitin asked

Why can't DSE 6.8.0 nodes communicate on Azure with just listen_address and native_transport_address configured?

FOLLOW UP QUESTION TO #4355

In my environment, I tried the same configuration, but it does not work when I add a second node. I am running DSE 6.8.0 on Ubuntu 18.04 Azure VMs.

Node 1:

listen_address : private_ip1
native_transport_address: public_ip1
Seeds: private_ip1

Node 2:

listen_address : private_ip2
native_transport_address: public_ip2
Seeds: private_ip1

The firewall is stopped. The two nodes are not able to communicate with each other. If I use the configuration below instead, it works:

node1:

listen_address: private_ip1
rpc_address: 0.0.0.0
broadcast_address: public_ip1
broadcast_rpc_address: public_ip1
seeds: public_ip1

node2:

listen_address: private_ip2
rpc_address: 0.0.0.0
broadcast_address: public_ip2
broadcast_rpc_address: public_ip2
seeds: public_ip1
dseconfiguration

1 Answer

Erick Ramirez answered

The symptoms you described indicate that you have a networking issue, not a DSE issue. If nodes are in the same region, they should be able to communicate with each other within the virtual network (vnet). But by default, if the nodes are not in the same region, you have to enable vnet peering on Azure.

UPDATE - Your first attempt wasn't working because you didn't configure an IP for native_transport_address:

INFO  [main] 2020-06-06 13:31:05,202  Config.java:684 - Node configuration:[ ... \
    listen_address=10.x.x.4; ... \
    native_transport_address=localhost; ... ]
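To compare what each node actually loaded at startup, the relevant settings can be pulled out of the "Node configuration" line in system.log. The following is a minimal Python sketch, assuming the `key=value; ` layout shown in the excerpt above; `node_settings` is a hypothetical helper, not part of DSE.

```python
import re

def node_settings(log_line, keys=("listen_address", "native_transport_address")):
    """Extract selected key=value settings from a 'Node configuration' log line.

    Assumes settings are separated by ';' as in the log excerpt above.
    """
    found = {}
    for key in keys:
        # Require a delimiter before the key so that, for example,
        # 'rpc_address' does not match inside 'broadcast_rpc_address'.
        pattern = r"(?:^|[\[;\s])" + re.escape(key) + r"=([^;\]]*)"
        match = re.search(pattern, log_line)
        if match:
            found[key] = match.group(1).strip()
    return found

line = ("Node configuration:[ ... listen_address=10.x.x.4; ... "
        "native_transport_address=localhost; ... ]")
print(node_settings(line))
```

Running it against the line from each startup attempt makes it easy to see which value of native_transport_address was in effect at the time.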

In your following attempts, you did assign a public IP for native_transport_address:

INFO  [main] 2020-06-06 13:52:31,470  Config.java:684 - Node configuration:[ ... \
    listen_address=10.x.x.4; ... \
    native_transport_address=207.x.x.191; ... ]

However, 207.x.x.191 (obfuscated for privacy) does not appear to be a valid address for this node. I say this because DSE isn't able to bind the CQL port to the public IP address you specified:

ERROR [DSE main thread] 2020-06-06 13:53:50,862  DseDaemon.java:562 - Unable to start DSE server.
java.lang.IllegalStateException: Failed to bind /207.x.x.191:9042.
    ...
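A quick way to check whether an address can be used as a bind address on a given VM is to try binding a socket to it, which is roughly what DSE attempts for the CQL port. The sketch below is a minimal Python example; `can_bind` is a hypothetical helper name. As an assumption worth verifying in your own setup: on Azure, a VM's public IP is normally mapped by the platform rather than assigned to the VM's network interface, which would be consistent with a bind failure like the one above.

```python
import socket

def can_bind(address, port=0):
    """Return True if this host can bind a TCP socket to the given address.

    port=0 asks the OS for any free port, so the check does not collide
    with a running DSE instance on 9042.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
        return True
    except OSError:
        # Typically "Cannot assign requested address" when the IP is not
        # configured on any local interface.
        return False
    finally:
        sock.close()
```

On the node itself, `can_bind("<private_ip>")` should succeed, while an address that is not configured on any local interface should return False.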

Contrary to your point, the second node did not bring the first node down. The second node failed to start because it couldn't gossip with 10.x.x.4:

INFO  [main] 2020-06-06 14:03:23,405  DseModule.java:102 - Loading DSE module
...
INFO  [main] 2020-06-06 14:03:24,130  Config.java:684 - Node configuration:[ ... \
   listen_address=10.x.x.6; ... \
   native_transport_address=207.x.x.168; ... \
   {seeds=10.0.5.4}; ... ]
ERROR [DSE main thread] 2020-06-06 14:04:37,328  CassandraDaemon.java:901 - Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any peers
    ...

Node 10.x.x.6 couldn't gossip with 10.x.x.4 at 14:04 because 10.x.x.4 had already gone down at 14:02:

ERROR [DSE main thread] 2020-06-06 14:02:30,559  CassandraDaemon.java:901 - Exception encountered during startup
java.lang.RuntimeException: java.lang.IllegalStateException: Failed to bind /207.x.x.191:9042.
    ...
INFO  [StorageServiceShutdownHook] 2020-06-06 14:02:30,574  DseDaemon.java:866 - DSE shutting down...
    ...
INFO  [StorageServiceShutdownHook] 2020-06-06 14:02:50,541  DseDaemon.java:945 - DSE shutdown complete.

You need to configure the correct public IP for node 10.x.x.4 before you can start it again. Cheers!


mukhejan_190041 commented

logs.zip

When I changed the configuration to the following, it worked.

Node:1

listen_address: private_ip1
seeds: private_ip1
native_transport_address: localhost

Node: 2

listen_address: private_ip2
seeds: private_ip1
native_transport_address: localhost

But if I change it to the following, it does not work:

Node:1

listen_address: private_ip1
seeds: private_ip1
native_transport_address: public_ip1

Node:2

listen_address: private_ip1
seeds: private_ip1
native_transport_address: public_ip1

Node 1 starts, but when I try to start node 2 it also brings node 1 down. See the attached logs.

Both VMs are in the same region and in the default vnet/subnet.

smadhavan replied to mukhejan_190041

@mukhejan_190041, what's the output when you perform the following commands from node1?

  • curl -v telnet://node2_private_ip:7000

  • curl -v telnet://node2_public_ip:7000
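If curl is not available on the VM, the same reachability test can be sketched in Python. This is a minimal example, not a DSE tool; `port_open` is a hypothetical helper, and `node2_private_ip` / `node2_public_ip` stand in for the real addresses as in the commands above.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False

# Example usage from node1 (replace the placeholders with real addresses):
#   port_open("node2_private_ip", 7000)
#   port_open("node2_public_ip", 7000)
```

Note that a False result only shows that nothing accepted the connection. It cannot distinguish a blocked network path from a DSE process that is simply not running on the target node.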

mukhejan_190041 replied to smadhavan

The above commands fail because no service is listening on port 7000 with the specified configuration.

When I change back to the previous configuration, i.e. native_transport_address: localhost, and then telnet to port 7000, it works because the service is running.

Erick Ramirez replied to mukhejan_190041

Looking through the data you just provided, it looks like you made several errors. Let me update my answer with an analysis of the logs. Cheers!

mukhejan_190041 replied to Erick Ramirez

I created a DataStax-managed 3-node cluster on Azure with the public IP option, and to my surprise native_transport_address was set to the private IP there.
