Hi, I have a table called customer. This table has 5 million (5,000,000) records. I want to update a particular field in all records.
Please suggest the fastest way to do this update.
Thanks!
If I may suggest, speed shouldn't be your goal -- it should be efficiency. Updating all the records in a table requires a full table scan and you could inadvertently bring your cluster down.
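There isn't a ready-made solution for doing this. You could export the data using the DataStax Bulk Loader (DSBulk) tool. From that output, create a new CSV file that contains the full primary key of each row (the partition key(s) plus any clustering columns) and the column you want to update. You can then bulk load that file into the cluster using DSBulk; because writes in Cassandra are upserts, loading a key that already exists overwrites only the columns present in the file.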
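As a rough illustration of the middle step (turning the exported CSV into the smaller CSV you load back), here is a minimal Python sketch. It assumes the export has a header row; the column names `customer_id` and `status`, the file names, and the helper `build_update_csv` are all hypothetical and only there for illustration. It streams row by row, so the 5 million records never have to fit in memory.

```python
import csv


def build_update_csv(src_path, dst_path, key_columns, target_column, new_value):
    """Copy only the key columns plus the updated column into a new CSV.

    src_path      -- CSV exported from the table (must have a header row)
    key_columns   -- names of all primary key columns (hypothetical names)
    target_column -- name of the column being updated
    new_value     -- function that takes the exported row (a dict) and
                     returns the new value for target_column
    Returns the number of data rows written.
    """
    rows_written = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=[*key_columns, target_column])
        writer.writeheader()
        for row in reader:
            # Keep the primary key so the load targets the existing row.
            out = {key: row[key] for key in key_columns}
            out[target_column] = new_value(row)
            writer.writerow(out)
            rows_written += 1
    return rows_written


if __name__ == "__main__":
    # Hypothetical file and column names; adjust to your own schema.
    count = build_update_csv(
        "customer_export.csv",
        "customer_update.csv",
        key_columns=["customer_id"],
        target_column="status",
        new_value=lambda row: "active",
    )
    print(f"wrote {count} rows")
```

The resulting file would then be the input to the DSBulk load step. Since it holds only the key and the one changed column, it is much smaller than the original export.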
If you already have a Spark cluster, a more efficient way of doing it is with Spark. Cheers!