
supadhaya asked Erick Ramirez commented

Can DSBulk upload data to a remote Cassandra cluster?

Hi

I want to know whether dsbulk must be installed on one of the Cassandra nodes to upload data, or whether it can upload data when run from another machine.

I tried running it from another system and got an error. My command is:

C:\dsbulk-1.7.0\bin>dsbulk count -h '10.XX.XX.XX' -port 9042 -u cassandra -p cassandra -k test_keyspace -t test;

Errors:

Username and password provided but auth provider not specified, inferring PlainTextAuthProvider
Operation directory: C:\dsbulk-1.7.0\bin\logs\COUNT_20210130-014525-794000
Ignoring invalid contact point '10.XX.XX.XX':9042 (unknown host '10.XX.XX.XX')
[driver] Error connecting to Node(endPoint=/127.0.0.1:9042, hostId=null, hashCode=e37ebe), trying next node (ConnectionInitException: [driver|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (java.nio.channels.ClosedChannelException))
Operation COUNT_20210130-014525-794000 failed: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=/127.0.0.1:9042, hostId=null, hashCode=e37ebe): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [driver|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (java.nio.channels.ClosedChannelException)].
Suppressed: [driver|control|connecting...] Protocol initialization request, step 1 (OPTIONS): failed to send request (java.nio.channels.ClosedChannelException).
Suppressed: Connection refused: no further information: /127.0.0.1:9042.
Caused by: Connection refused: no further information.
Caused by: Channel is closed.

Is something wrong with the command, or does dsbulk not work when run from another machine?

Please let me know.

Thanks!!

dsbulk

1 Answer

Erick Ramirez answered Erick Ramirez commented

Yes, DSBulk is a client that can connect to any Cassandra cluster.

You haven't provided much information in your post, but I suspect your cluster is not configured correctly. If the nodes have not been configured to listen for CQL clients on an IP address, they default to localhost (127.0.0.1) and are not accessible remotely.

If you're running Apache Cassandra, make sure you set the rpc_address to the node's IP address (native_transport_address for DSE nodes). Cheers!
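One quick way to test whether a node accepts remote CQL connections is to open a plain TCP connection to the native transport port from the client machine. The sketch below is only an illustration: the helper name can_connect is made up for this example, and 9042 is assumed as the port because that is the one used in the question.

```python
import socket


def can_connect(host: str, port: int = 9042, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only checks that the port
    accepts connections, not that Cassandra authentication will succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

Calling can_connect with the node's address from the machine where dsbulk runs should return True if the node is listening on a reachable address. A False result points to the node's listen settings or to a firewall rather than to DSBulk itself.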

4 comments

Hi

Thanks for responding.

I have Apache Cassandra installed on 10.XX.XX.XX, with these settings in cassandra.yaml:

start_rpc: true
rpc_address: 0.0.0.0
rpc_port: 9160
broadcast_rpc_address: 10.XX.XX.XX


With the above settings I can connect to the database from an application running on 10.yy.yy.yy using the DataStax C# driver. But when I run dsbulk from that same machine, I get the error mentioned in my original post.

I even tried setting rpc_address to 10.XX.XX.XX, but nothing changed.

Are there any other settings I need to check?

Thanks
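A detail in the original log may be worth checking: it reports the contact point as '10.XX.XX.XX', quotes included, as an unknown host before falling back to 127.0.0.1. One possible explanation (an assumption, not something confirmed in this thread) is that the Windows command prompt does not strip single quotes, so they reach dsbulk as part of the host name. The sketch below uses a placeholder address, 10.0.0.1, to show why a quoted value is not a valid IP literal.

```python
import ipaddress


def is_plain_ip(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address literal."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


# 10.0.0.1 is a placeholder, not the address from the thread.
print(is_plain_ip("'10.0.0.1'"))  # quotes kept as part of the argument -> False
print(is_plain_ip("10.0.0.1"))    # bare address -> True
```

If this is the cause, rerunning the same command with the host unquoted (or in double quotes) would be a cheap experiment before changing anything on the server.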


Please run this command on the 10.x.x.x node and attach (do not paste) the output file it generates:

$ netstat -tlnp > netstat-`hostname -i`.txt
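The command above is written for Linux. On Windows, netstat -ano is the usual way to list sockets with numeric addresses, and its output can be scanned for the CQL port. The sketch below is a hypothetical helper, not part of any Cassandra tooling, and the SAMPLE text is invented for illustration rather than taken from the attachments in this thread.

```python
def listeners_on_port(netstat_text, port=9042):
    """Return the local addresses that are listening on the given port.

    Accepts rows in either the Linux (LISTEN) or Windows (LISTENING) layout.
    """
    addresses = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Keep only rows whose state column marks a listening socket.
        if not any(f.upper() in ("LISTEN", "LISTENING") for f in fields):
            continue
        # The first field containing a colon is the local address.
        local = next((f for f in fields if ":" in f), None)
        if local is None:
            continue
        host, _, port_text = local.rpartition(":")
        if port_text == str(port):
            addresses.append(host)
    return addresses


# Invented sample in the Windows "netstat -ano" layout.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       900
  TCP    127.0.0.1:9042         0.0.0.0:0              LISTENING       4321
"""
print(listeners_on_port(SAMPLE))  # ['127.0.0.1']
```

A result of 127.0.0.1 would mean the node only accepts local CQL clients. A result of 0.0.0.0 or the node's own IP would mean it is listening for remote ones.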
supadhaya replied to Erick Ramirez

Hi Erick,

Thanks for replying.

I've installed Cassandra on a Windows system.

The command you asked me to run did not work.

I ran: netstat -tnp tcp > result1.txt

Attaching the result: result1.txt

I also ran: netstat -a > result2.txt

Attaching the result: result2.txt

Please let me know if you need any other details.

Thanks

result1.txt (1.3 KiB)
result2.txt (5.0 KiB)

You don't need to start the RPC server because that's for the deprecated Thrift protocol, and you don't need to configure the broadcast address, so these settings are not required:

start_rpc: true
rpc_port: 9160
broadcast_rpc_address: 10.XX.XX.XX

Cassandra is no longer officially supported on Windows because there are so many issues with it. If you want to persist, just set:

listen_address: 10.x.x.x
rpc_address: 10.x.x.x

But if you're just trying out DSBulk on Cassandra, I recommend you use Astra. It has a FREE forever tier with no credit card required. You can launch a Cassandra cluster with just a few clicks. Cheers!
