Bringing together the Apache Cassandra experts from the community and DataStax.

teriksson asked ·

Unable to add node due to broken pipe / stream when syncing data from other nodes

We are having difficulty adding a Cassandra (C*) node to the cluster.

After only a few minutes, the stream breaks with the following error:

ERROR [STREAM-OUT-/10.xx.xx.x9:35416] 2020-11-05 13:37:53,728 StreamSession.java:593 - [Stream #e07d93a0-1f6b-11eb-8af8-391d84d34cb8] Streaming error occurred on session with peer 10.xx.xx.x9
Streaming error occurred on session with peer 10.xx.xx.x9
org.apache.cassandra.io.FSReadError: javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe (Write failed)
Caused by: javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: java.net.SocketException: Broken pipe (Write failed)

We are running Cassandra 3.11.2.

How can we troubleshoot this further?
I read an article from 2016 about sections that were too large to transfer, but that was fixed in 3.3, so it should not be the cause here.

Can we turn on more logging to find out more?

Or is this a known issue when doing .... and then we need to ... or something?

Any help is much appreciated.


@teriksson, have you checked the debug.log on that node to understand what events happened there prior to this error? Also, turning on DEBUG level logging might help you uncover what is happening.


Yes, I have checked the logs, and what you see above is what they show.

We should certainly try turning on DEBUG. I just found out that I can do that with nodetool setlogginglevel <class> DEBUG.

But I am not sure which class to choose:

  • org.apache.cassandra
  • org.apache.cassandra.db
  • org.apache.cassandra.service.StorageProxy

I guess we will have to try and see which is the right one, or turn on all of them at the same time.

Will this turn on the DEBUG level for the whole cluster, or just for the node I am connected to?
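As a rough sketch of the nodetool setlogginglevel approach mentioned above: the command talks to a single node over JMX, so it changes the level only on the node it is run against, and the change is not kept across a restart. The Python helper below only builds the command lines for a list of loggers and executes nothing. The logger org.apache.cassandra.streaming is an addition not named in this thread (the failing StreamSession class lives in that package), and the function and variable names are hypothetical.

```python
import shlex


def build_setlogginglevel_commands(loggers, level="DEBUG", host=None):
    """Return one nodetool argv list per logger. Nothing is executed here."""
    commands = []
    for logger in loggers:
        argv = ["nodetool"]
        if host is not None:
            # -h selects which node nodetool connects to.
            argv += ["-h", host]
        argv += ["setlogginglevel", logger, level]
        commands.append(argv)
    return commands


# Assumption: the streaming package is the most relevant logger for this error.
LOGGERS = ["org.apache.cassandra.streaming", "org.apache.cassandra.db"]

# Print the commands so they can be reviewed before being run by hand.
for argv in build_setlogginglevel_commands(LOGGERS):
    print(shlex.join(argv))
```

Passing level="INFO" to the same helper gives the commands to lower the level again once the stream failure has been captured.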


@teriksson, I would start with org.apache.cassandra.db and, if that doesn't provide enough insight, change it to org.apache.cassandra.


1 Answer

Erick Ramirez answered ·

You haven't provided sufficient info, so our ability to diagnose the problem is limited.

It would be really great if you could provide the full error + the full stack trace.

In my experience, the SSL exception is a clue and indicates that the new node is most likely not configured correctly with the certificates and keystore/truststore. The full stack trace would be the starting point here. Cheers!


@Erick Ramirez, here is the stack trace, both as text and as an image. I included the image because it was so hard to make the text readable.

Any idea what this tells us?

1605193697009.png (105.4 KiB)
cass.txt (4.9 KiB)

I needed to look at the full stack trace, including the section after the Caused by. :)

In any case, as I said in my original response, the SSL exception is a clue that node-to-node encryption isn't configured correctly, so the stream between the nodes fails. Have a look at the logs of 10.x.x.79 at exactly the same time as the stream failure, because they will tell you why the connection failed.
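One way to line up the two nodes' logs is to pull out every entry within a short window around the failure timestamp. The sketch below assumes the log line layout shown in the question (level, thread in brackets, then a timestamp such as 2020-11-05 13:37:53,728) and that both nodes' clocks are reasonably in sync. The function names and the 30-second window are hypothetical choices, not anything Cassandra provides.

```python
import re
from datetime import datetime, timedelta

# Matches the start of a log entry such as:
# ERROR [STREAM-OUT-/10.xx.xx.x9:35416] 2020-11-05 13:37:53,728 StreamSession.java:593 - ...
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s"
)


def parse_timestamp(line):
    """Return the entry's timestamp, or None for continuation lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")


def lines_near(lines, failure_time, window_seconds=30):
    """Keep entries within the window, plus their stack-trace lines."""
    window = timedelta(seconds=window_seconds)
    selected = []
    keep = False
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None:
            # A new entry starts here; decide whether it is in the window.
            keep = abs(ts - failure_time) <= window
        if keep:
            # Lines without a timestamp inherit the decision of their entry.
            selected.append(line)
    return selected


# Failure time taken from the error in the question.
FAILURE_TIME = datetime(2020, 11, 5, 13, 37, 53, 728000)
```

To use it, read the other node's system.log into a list of lines and pass it to lines_near together with FAILURE_TIME. The log file location depends on how Cassandra was installed.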

For future reference, you can click on the code formatter button to format log output if you're using the default HTML editor.

Cheers!

teriksson replied to Erick Ramirez ·

Hi @Erick Ramirez, the full stack trace was included; see the attached text file above.

Of course we also thought there was an issue with the SSL setup, but this is one of 30+ nodes, and all nodes are installed the same way. I need more ideas on what is really causing this, so I am hoping for some tips and grasping at every straw I can get.
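One way to check the "all nodes are installed the same way" assumption is to compare checksums of files that are expected to be byte-identical on every node, for example a shared truststore. This is only a sketch: keystores usually hold a per-node certificate and will legitimately differ, the file to compare is your choice, and the function names below are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_nodes(digests_by_node, reference_node):
    """Return the nodes whose digest differs from the reference node's."""
    reference = digests_by_node[reference_node]
    return sorted(
        node for node, digest in digests_by_node.items() if digest != reference
    )
```

Run sha256_of on each node against the same file, collect the results into a dictionary keyed by node address, and pass it to differing_nodes with a known-good node as the reference. An empty list means the file matches everywhere.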
