mjcarey_178232 asked Erick Ramirez commented

Can you provide a concrete example of loading data to Astra with DSBulk?

I am trying to do a first-time upload of CSV data into my Astra database. The documentation on this is a bit thin and sketchy, and I really need to look at a few concrete examples. Unfortunately, all of the dsbulk examples I can find are non-Astra examples. Help! I am trying various things but, being new to Astra, I'm sure I'm missing something basic and I'm unlikely to spot it without an assist. The command I am trying (minus the actual password) is:

dsbulk load -url boats.csv -k hoofers -t boats -b "/Users/mikejcarey/datastax/secure-connect-cs122d-class.zip" -u mjcarey@ics.uci.edu -p mypassword -header true

Things I'm not sure of include:

1. Should that path to the secure connect bundle include its filename (as I am doing)?

2. Is the user info that I should provide the one with which I registered on Astra (which is what I am showing above)?

3. How does dsbulk know where the Astra endpoints are to connect to? Are they in the secure connect bundle somewhere?

I am getting an auth failure and I'm not sure how to debug this problem. Once I get my first CSV to load I will be over the initial hump. (And I'm not sure what to make of the "we recently improved your database security" message.)

Here's the failure. Help???

Operation LOAD_20210401-235410-149835 failed: Could not reach any contact point, make sure you've provided valid addresses (showing first 3 nodes, use getAllErrors() for more): Node(endPoint=fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:ecf2c587-1816-495f-832e-9f78ff1bfa70, hostId=null, hashCode=5211ac94): [com.datastax.oss.driver.api.core.auth.AuthenticationException: Authentication error on node fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:ecf2c587-1816-495f-832e-9f78ff1bfa70: server replied with 'We recently improved your database security. To find out more and reconnect, see https://docs.datastax.com/en/astra/docs/manage-application-tokens.html.' to AuthResponse request], Node(endPoint=fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:68a5d184-2cf0-435d-9d64-e0cef4163049, hostId=null, hashCode=5a56c7e6): [com.datastax.oss.driver.api.core.auth.AuthenticationException: Authentication error on node fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:68a5d184-2cf0-435d-9d64-e0cef4163049: server replied with 'We recently improved your database security. To find out more and reconnect, see https://docs.datastax.com/en/astra/docs/manage-application-tokens.html.' to AuthResponse request], Node(endPoint=fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:c3ef9395-aea4-4c49-b046-a0aad26f6c50, hostId=null, hashCode=2904cf64): [com.datastax.oss.driver.api.core.auth.AuthenticationException: Authentication error on node fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:c3ef9395-aea4-4c49-b046-a0aad26f6c50: server replied with 'We recently improved your database security. To find out more and reconnect, see https://docs.datastax.com/en/astra/docs/manage-application-tokens.html.' to AuthResponse request].
   Suppressed: Authentication error on node fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:ecf2c587-1816-495f-832e-9f78ff1bfa70: server replied with 'We recently improved your database security. To find out more and reconnect, see https://docs.datastax.com/en/astra/docs/manage-application-tokens.html.' to AuthResponse request.
   Suppressed: Authentication error on node fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:68a5d184-2cf0-435d-9d64-e0cef4163049: server replied with 'We recently improved your database security. To find out more and reconnect, see https://docs.datastax.com/en/astra/docs/manage-application-tokens.html.' to AuthResponse request.
   Suppressed: Authentication error on node fe1ab7b0-9a39-4ea5-9d7f-c510e474d926-us-east-1.db.astra.datastax.com:29042:c3ef9395-aea4-4c49-b046-a0aad26f6c50: server replied with 'We recently improved your database security. To find out more and reconnect, see https://docs.datastax.com/en/astra/docs/manage-application-tokens.html.' to AuthResponse request.
2 comments

Why does this question keep getting "moved to moderation" where it seems not to be visible???


When you post a large amount of text, the system flags it as potential spam and sends it to the moderation queue for review.

Moderators need to review the post and approve it as appropriate for it to show on the site. Cheers!


1 Answer

Erick Ramirez answered Erick Ramirez commented

You are seeing authentication failures because you are using what I assume is your username and password for logging on to the Astra web UI.

And yes, the secure bundle contains the contact points. If you unzip the bundle, you will see that one of the files is cqlshrc which has the contact point and CQL port. Here's an example extract:

[connection]
hostname = 31fecf38-2491-4d43-b6ce-22562679f1b8-us-east1.db.astra.datastax.com
port = 34567
ssl = true
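To see which endpoint a given bundle points at without unpacking it, the cqlshrc entry can be read straight from the zip. This is only a minimal sketch: it assumes unzip is installed and that the bundle's cqlshrc is laid out like the extract above. The helper name cqlshrc_value and the bundle path are made up for illustration.

```shell
# Print the value of one "key = value" setting from cqlshrc text on stdin.
# cqlshrc_value is a hypothetical helper, not part of dsbulk or cqlsh.
cqlshrc_value() {
  awk -F' *= *' -v key="$1" '$1 == key { print $2 }'
}

# Example usage (the bundle path is a placeholder):
#   unzip -p /path/to/secure-connect-db.zip cqlshrc | cqlshrc_value hostname
#   unzip -p /path/to/secure-connect-db.zip cqlshrc | cqlshrc_value port
```

You normally never need these values for dsbulk itself, since the -b option reads the bundle for you; the lookup is only useful for confirming that you downloaded the bundle for the right database.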

As the error message states, Astra was recently updated to use Identity and Access Management (IAM). You will need to generate an application token and use the client ID + secret in place of the username/password to connect to your database.

I apologise that this isn't clear in the documentation. I will work with the Docs team to get this fixed. In the meantime, here is a full working example of how I loaded data to my Astra database.

Prerequisites

Schema

For the purposes of my example, I have a keyspace community and table users:

CREATE TABLE community.users (
    username text,
    realname text,
    email text,
    PRIMARY KEY (username)
);

Data

Here are the contents of my users.csv:

username,realname,email
alice,Alice Bautista,alice@acme.com
bob,Bob Adams,bob.adams@mail.co
charlie,Charlie Choi,cchoi@gotmail.com
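If you want to reproduce this file exactly and check that its header line matches the column names in the table before loading, something like the following would do. It is a sketch only: write_users_csv and csv_header are hypothetical helper names, and the check is a plain string comparison on the first line of the file.

```shell
# Write the header plus the three sample rows to the path given as $1.
write_users_csv() {
  cat > "$1" <<'EOF'
username,realname,email
alice,Alice Bautista,alice@acme.com
bob,Bob Adams,bob.adams@mail.co
charlie,Charlie Choi,cchoi@gotmail.com
EOF
}

# Print the header line of a CSV file.
csv_header() {
  head -n 1 "$1"
}

# Example usage:
#   write_users_csv users.csv
#   [ "$(csv_header users.csv)" = "username,realname,email" ] && echo "header OK"
```

The header matters here because the load command below is run with -header true, so the first line is treated as field names rather than data.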

App token

On the Astra UI, select your database and click on the Settings tab to generate a new token.

In my case, I generated a token with just read/write access:

[Image: c10901-astra-generate-token.png]

IMPORTANT: Download the credentials in CSV format since you will not be able to view the details again once you close the window.

Load to Astra

Step 1 - Create the table schema on Astra.

Step 2 - Generate an app token.

Step 3 - Download the secure connect bundle to your local machine.

Step 4 - On your local machine, create the users.csv data file.

Step 5 - Using the secure bundle and client ID + secret, load the data as follows:

$ dsbulk load -url /path/to/users.csv -header true -k community -t users -b "/path/to/secure-connect-db.zip" -u client_id -p client_secret

In my case, the output was:

Username and password provided but auth provider not specified, inferring PlainTextAuthProvider
A cloud secure connect bundle was provided: ignoring all explicit contact points.
A cloud secure connect bundle was provided and selected operation performs writes: changing default consistency level to LOCAL_QUORUM.
Operation directory: /home/ubuntu/dsbulk-1.5.0/logs/LOAD_20210406-032949-502743
total | failed | rows/s |  p50ms |  p99ms | p999ms | batches
    3 |      0 |      2 | 267.56 | 278.92 | 278.92 |    1.00
Operation LOAD_20210406-032949-502743 completed successfully in 0 seconds.
Last processed positions can be found in positions.txt
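If you expect to load more than one file, Step 5 could be wrapped in a small shell function so the client ID and secret are not retyped for every run. This is a sketch under assumptions: the environment variable names ASTRA_BUNDLE, ASTRA_CLIENT_ID and ASTRA_CLIENT_SECRET, the DSBULK override and the function name are all made up here. Only the dsbulk load options come from the command above.

```shell
# Load a CSV file with a header row into keyspace.table on Astra.
# Usage: load_csv_to_astra <csv-file> <keyspace> <table>
# All names below (function and variables) are hypothetical.
load_csv_to_astra() {
  # Abort with a message if any required setting is missing or empty.
  : "${ASTRA_BUNDLE:?set to the path of the secure connect bundle zip}"
  : "${ASTRA_CLIENT_ID:?set to the client ID of the app token}"
  : "${ASTRA_CLIENT_SECRET:?set to the client secret of the app token}"

  # DSBULK can point at a specific dsbulk binary; it defaults to "dsbulk".
  "${DSBULK:-dsbulk}" load -url "$1" -header true -k "$2" -t "$3" \
    -b "$ASTRA_BUNDLE" -u "$ASTRA_CLIENT_ID" -p "$ASTRA_CLIENT_SECRET"
}

# Example usage:
#   ASTRA_BUNDLE=/path/to/secure-connect-db.zip
#   ASTRA_CLIENT_ID=client_id
#   ASTRA_CLIENT_SECRET=client_secret
#   load_csv_to_astra /path/to/users.csv community users
```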

I connected to the CQL Console on the Astra UI to verify that the data was loaded to the table:

token@cqlsh> SELECT * FROM community.users ;

 username | email             | realname
----------+-------------------+----------------
      bob | bob.adams@mail.co |      Bob Adams
  charlie | cchoi@gotmail.com |   Charlie Choi
    alice |    alice@acme.com | Alice Bautista
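The same check can also be done from your local machine with dsbulk count, which takes the same connection options as dsbulk load. A sketch, using the same placeholder bundle path and credentials as Step 5; the function name and the DSBULK override are hypothetical.

```shell
# Count the rows in keyspace.table on Astra.
# Usage: count_astra_rows <keyspace> <table> <bundle-zip> <client-id> <client-secret>
count_astra_rows() {
  # DSBULK can point at a specific dsbulk binary; it defaults to "dsbulk".
  "${DSBULK:-dsbulk}" count -k "$1" -t "$2" -b "$3" -u "$4" -p "$5"
}

# Example usage:
#   count_astra_rows community users /path/to/secure-connect-db.zip client_id client_secret
```

For the sample data, the reported total should match the number of rows loaded.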

Let me know how it goes. Cheers!


6 comments

Now that's a 5-star answer - thanks!!!


You're welcome. Cheers!


Knowing this (how to manage and specify the auth stuff) I was also able to download and operate the standalone cqlsh interface for CQL and to connect it to Astra on my first try. Thx again!


Glad to hear it worked. I'm working in the background on getting the official docs revised. Cheers!

mjcarey_178232 (replying to Erick Ramirez):

I did encounter one issue for one file I wanted to upload - and I ended up having to add

--executor.maxPerSecond 1000

to the dsbulk command call. Apparently it was trying to load too fast for the free tier of Astra. Might be worth a mention in the official docs?
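For anyone who hits the same thing, the option simply goes on the end of the load command from the answer. A sketch with a hypothetical wrapper name and DSBULK override; 1000 is just the value that worked for this one file, not a documented limit.

```shell
# Run "dsbulk load" with a cap on operations per second.
# Usage: dsbulk_load_throttled <max-per-second> <other dsbulk load options...>
dsbulk_load_throttled() {
  rate=$1
  shift
  # DSBULK can point at a specific dsbulk binary; it defaults to "dsbulk".
  "${DSBULK:-dsbulk}" load "$@" --executor.maxPerSecond "$rate"
}

# Example usage (paths and credentials are placeholders):
#   dsbulk_load_throttled 1000 -url /path/to/users.csv -header true -k community -t users \
#     -b "/path/to/secure-connect-db.zip" -u client_id -p client_secret
```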
