Hi,
We are evaluating the DataStax Bulk Loader (dsbulk) for loading bulk CSV data into a Cassandra cluster, but we are running into an issue while loading the data.
If we run "dsbulk-1.4.1/bin/dsbulk load -f xxxxxx.conf -url xxxxxx.csv -k xxxxxxx -t xxxxxxx -h 'x.x.x.x,x.x.x.x,x.x.x.x' -header true", we see the following error in the logs:
Source: "2fa67c1df1913c24",0,0,0,0,0,0,2,0,0\u000d
java.lang.IllegalArgumentException: Expecting record to contain 10 fields but found 11.
    at com.datastax.dsbulk.connectors.api.internal.DefaultRecord.<init>(DefaultRecord.java:125)
    at com.datastax.dsbulk.connectors.api.internal.DefaultRecord.mapped(DefaultRecord.java:58)
    at com.datastax.dsbulk.connectors.csv.CSVConnector.lambda$readURL$6(CSVConnector.java:523)
    at com.datastax.dsbulk.engine.LoadWorkflow.parallelFlux(LoadWorkflow.java:258)
    [20 skipped]
    at com.datastax.dsbulk.engine.LoadWorkflow.execute(LoadWorkflow.java:191)
    at com.datastax.dsbulk.engine.DataStaxBulkLoader$WorkflowThread.run(DataStaxBulkLoader.java:128)
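The \u000d at the end of the source line is a carriage return, so we suspect the CSV has Windows-style (CRLF) or bare-CR line endings. A quick way to confirm this (a sketch; sample.csv below is stand-in data, not our real file):

```shell
# Create a small stand-in file with CRLF line endings, mimicking what we
# suspect our real CSV looks like (the real file name is a placeholder).
printf 'a,b\r\nc,d\r\n' > sample.csv

# Count the carriage-return characters in the file; a non-zero count
# means the file does not use plain \n line endings.
tr -dc '\r' < sample.csv | wc -c
```

For the stand-in file above this prints 2, one carriage return per line.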
If we instead run "dsbulk-1.4.1/bin/dsbulk load -f xxxxxx.conf -url xxxxxx.csv -k xxxxxxx -t xxxxxxx -h 'x.x.x.x,x.x.x.x,x.x.x.x' -header true -newline '\u000d'", the data does get loaded, but when we query the database we see \n and double quotes ("") included in the column values:
select idx from keyspace.table;

 idx
 ----------------------------------------------------
 \n"59976eb7.a787.4ba5.ba95.37e8fd3e91afU9ou2fRxx2"
 \n"18ce047c7556123f"
 \n"7b434da2.4f45.4013.bb65.987b8d83b6a6lv5DhEPNld"
 \n"fa8cd5bd.2a6c.4b28.8f20.b07a9a3864c1zgUl6p7R2k"
 \n"99b6b8c2.3cfe.4b64.a971.08a7c735b54bb0eVLzzPjV"
 \n"2b3860fa.0717.407c.bf3b.018e545e373bNJ7cInZaXX"
 \n"af2eea0560ec9082"
We don't want the newline character (\n) to end up as part of the idx field. Could you please let us know which configuration we should use to resolve this?
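For completeness, we know we could normalize the line endings ourselves before loading, for example (a sketch; the file names are placeholders for our real CSV paths):

```shell
# Stand-in input with CRLF line endings, mimicking the suspected problem.
printf 'h1,h2\r\n"2fa67c1df1913c24",0\r\n' > input.csv

# Strip all carriage returns so the file uses plain \n line endings.
tr -d '\r' < input.csv > output.csv

# Verify: count of remaining carriage returns should be 0.
tr -dc '\r' < output.csv | wc -c
```

But we would prefer a dsbulk-side configuration option over preprocessing every file, if one exists.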