I'm trying to add a new datacenter in Azure. I completed all the configuration by following this document.
While running the nodetool rebuild command, I get the exception below after around 30-40 minutes. I noticed that it happens while replicating very large files (approx. 200-300 GB).
ERROR [STREAM-OUT-/<source_ip>:7000] 2021-05-07 15:39:24,969 StreamSession.java:609 - [Stream #e66ad4d0-af44-11eb-b016-0fdd3f794cc3] Streaming error occurred on session with peer <source_ip>
java.io.IOException: Connection timed out
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_292]
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_292]
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_292]
    at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[na:1.8.0_292]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470) ~[na:1.8.0_292]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.doFlush(BufferedDataOutputStreamPlus.java:323) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.flush(BufferedDataOutputStreamPlus.java:331) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:409) [apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:380) [apache-cassandra-3.11.10.jar:3.11.10]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_292]
ERROR [STREAM-IN-/<source_ip>:7000] 2021-05-07 15:39:30,929 StreamSession.java:609 - [Stream #e66ad4d0-af44-11eb-b016-0fdd3f794cc3] Streaming error occurred on session with peer <source_ip>
java.lang.RuntimeException: Stream receive task e66ad4d0-af44-11eb-b016-0fdd3f794cc3 of cf 57722510-658b-11eb-9058-07778e41ebc3 already finished.
    at org.apache.cassandra.streaming.StreamReceiveTask.createLifecycleNewTracker(StreamReceiveTask.java:145) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.StreamReader.createWriter(StreamReader.java:155) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:92) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:54) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:43) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:61) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:311) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_292]
ERROR [STREAM-OUT-/<source_ip>:7000] 2021-05-07 15:39:30,930 StreamSession.java:609 - [Stream #e66ad4d0-af44-11eb-b016-0fdd3f794cc3] Streaming error occurred on session with peer <source_ip>
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_292]
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_292]
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_292]
    at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[na:1.8.0_292]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470) ~[na:1.8.0_292]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.doFlush(BufferedDataOutputStreamPlus.java:323) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.flush(BufferedDataOutputStreamPlus.java:331) ~[apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:409) [apache-cassandra-3.11.10.jar:3.11.10]
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:388) [apache-cassandra-3.11.10.jar:3.11.10]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_292]
I also tried setting the following kernel (sysctl) values:
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
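As a sanity check on what these kernel settings imply, here is a minimal Python sketch of the keepalive arithmetic. It assumes the usual Linux behaviour: the first probe is sent after tcp_keepalive_time seconds of idle time, further probes follow every tcp_keepalive_intvl seconds, and the connection is dropped after tcp_keepalive_probes unanswered probes. The function name is my own and purely illustrative.

```python
def keepalive_timeline(time_s, intvl_s, probes):
    """Return (seconds until first probe, seconds until a dead peer is detected).

    Assumes an idle connection and the standard Linux keepalive semantics.
    """
    first_probe = time_s
    dead_peer_detected = time_s + intvl_s * probes
    return first_probe, dead_peer_detected


# Values from the sysctl settings above.
first_probe, detected = keepalive_timeline(7200, 75, 9)
print(f"first probe after {first_probe / 60} min")       # 120.0 min
print(f"dead peer detected after {detected / 60} min")   # 131.25 min
```

With these values the first keepalive probe would only go out after 120 minutes of idle time, which is well past the 30-40 minutes at which the stream fails.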
And in cassandra.yaml, I updated the following values:
streaming_keep_alive_period_in_secs: 10000
streaming_socket_timeout_in_ms: 86400000
But I still get the exception and am not able to stream all the data.
Does anybody have any idea how this can be fixed? Thanks in advance.