I want to upload nested JSON into DSE Graph from a CSV file using dsbulk, but I am not able to load the data.
I have the JSON data as a string, and my final goal is to upload it as text into the data property of the Account vertex:
"{""Account ID"":""ASONAR"",""VOL ID"":""somename"",""Persona ID"":""Sonar"",""Account Login"":""ASonar@csomename.com""}"
I have the following schema for the Account vertex:
schema.vertexLabel('Account').
  ifNotExists().
  partitionBy('tenantId', Ascii).
  partitionBy('appId', Ascii).
  partitionBy('nativeType', Ascii).
  clusterBy('entityGlobalId', Uuid, Asc).
  property('data', Text).
  property('displayName', Text).
  create()
// Ensure no searches are made without the partitioning key
schema.vertexLabel('Account').
  searchIndex().
  ifNotExists().
  by('nativeId').asString().
  by('displayName').asText().
  by('status').
  waitForIndex(30).
  create()
and this is the data I am trying to upload:
tenantId,appId,nativeType,entityGlobalId,entityKey,Status,metaType,nativeStatus,updateTime,nativeAsOnTime,nativeModifiedOnTime,createTime,nativeId,displayName,data
Default,someId,Service Account,6ab03f78-8da0-438c-b619-2453320dbcd4,'ASONAR',Active,Service Account,,,05-11-18 12:18:43.000000 PM,05-11-18 12:18:43.000000 PM,05-11-18 12:19:38.738000 PM,'ASONAR','ASONAR',"{""Account ID"":""ASONAR"",""V ID"":""Abhishek"",""Persona ID"":""Sonar"",""Account Login"":""Asomeone@someone.com""
Using this command:
>dsbulk load -url D:\\dse_6.8\\iap_schema\\Account_schema\\vertcies\\tryAccount.csv -g iapTest -v Account -header true -h 10.1.27.44 -delim "," --schema.allowMissingFields true -u confluxsys -p passw0rd --driver.advanced.auth-provider.class DsePlainTextAuthProvider
I got this:
[s0] Error while computing token map for replication settings {SearchGraphAnalytics=3, class=org.apache.cassandra.locator.NetworkTopologyStrategy}: could not achieve replication factor 3 for datacenter SearchGraphAnalytics (found only 1 replicas).

total | failed | vertices/s | p50ms | p99ms | p999ms | batches
    1 |      1 |          0 |  0.00 |  0.00 |   0.00 |    0.00

Operation LOAD_20200526-063359-270000 completed with 1 errors in 0 seconds.
Logs:
Source: Default,CAMR-VOLVisaRiskManager,Service Account,6ab03f78-8da0-438c-b619-2453320dbcd4,'ASONAR',Active,Service Account,,,05-11-18 12:18:43.000000 PM,05-11-18 12:18:43.000000 PM,05-11-18 12:19:38.738000 PM,'ASONAR','ASONAR',"{""Account ID"":""ASONAR"",""VOL ID"":""Abhishek"",""Persona ID"":""Sonar"",""Account Login"":""Abhishek.Sonar@confluxsys.com""}"
java.lang.IllegalArgumentException: Expecting record to contain 15 fields but found 18.
    at com.datastax.dsbulk.connectors.api.internal.DefaultRecord.<init>(DefaultRecord.java:125)
    at com.datastax.dsbulk.connectors.api.internal.DefaultRecord.mapped(DefaultRecord.java:58)
    at com.datastax.dsbulk.connectors.csv.CSVConnector.lambda$readSingleFile$1(CSVConnector.java:244)
    at com.datastax.dsbulk.engine.LoadWorkflow.parallelFlux(LoadWorkflow.java:259) [23 skipped]
    at com.datastax.dsbulk.engine.LoadWorkflow.execute(LoadWorkflow.java:192)
    at com.datastax.dsbulk.engine.DataStaxBulkLoader$WorkflowThread.run(DataStaxBulkLoader.java:128)
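My suspicion is that the three extra fields come from the three commas inside the JSON: the header defines 15 columns, and 15 + 3 = 18, which would mean the parser is not treating the quoted JSON as a single field. A quick check (a sketch using Python's csv module as a stand-in parser, with the row shortened to 5 columns) shows the same pattern:

import csv
import io

# Shortened to 5 columns for illustration; the real row has 15.
quoted = ('Default,someId,Service Account,6ab03f78,'
          '"{""Account ID"":""ASONAR"",""VOL ID"":""X"",'
          '""Persona ID"":""Sonar"",""Account Login"":""a@b.com""}"')
unquoted = ('Default,someId,Service Account,6ab03f78,'
            '{"Account ID":"ASONAR","VOL ID":"X",'
            '"Persona ID":"Sonar","Account Login":"a@b.com"}')

for label, line in [("quoted JSON", quoted), ("unquoted JSON", unquoted)]:
    fields = next(csv.reader(io.StringIO(line)))
    print(label, "->", len(fields), "fields")

# quoted JSON   -> 5 fields
# unquoted JSON -> 8 fields  (5 + the 3 commas inside the JSON)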
_____________________________
When I use "|" as the delimiter instead, I get this:
[s0] Unexpected error while refreshing schema during initialization, keeping previous version (CompletionException: com.datastax.oss.driver.api.core.DriverTimeoutException: query 'SELECT * FROM system_schema.tables' timed out after PT2S)

Operation LOAD_20200526-065323-637000 failed: Keyspace "iapTest" does not exist. <<<
I also tried wrapping the JSON in " to make it a single string, but that didn't work either.