Hello,
maybe I have 3 Tables:
users_by_id
users_by_email
users_by_username
So, I have to write 3 write querys. But is that not expensive ? And maybe I have 11 Tables. Then I have to writes 11 write querys. Is there another solution?
Hello,
maybe I have 3 Tables:
users_by_id
users_by_email
users_by_username
So, I have to write 3 write querys. But is that not expensive ? And maybe I have 11 Tables. Then I have to writes 11 write querys. Is there another solution?
@dataserver17, could you please provide additional details on the version of Cassandra/DSE cluster and the driver language and version? Thanks!
Hello,
well, in most cases that's the least expensive solution to accommodate your needs. If you have these three tables, it means that your app workflow will need all three access patterns to get users. Most likely you will read more often than you write, so it is even more true that duplicating the data at write time is a performance gain on the long run.
There are other solutions (creating indices or even materialized views), which however, depending on the use case, may generate more load to the nodes and higher latency.
If you are concerned about having the best read performance, performing multiple writes is still your best option - and your client may issue the writes in parallel, spread over nodes compatibly with your replication factor.
In some cases you may also want to consider grouping your write statements inside a batch (in this case it would be a logged multi-table batch), but even that comes with some performance penalty.
Yes, you do need to do multiple writes to keep the denormalised tables in sync.
It is true that multiple table updates can be expensive but it is a tradeoff in preference to making reads really fast because you do not need to do lookups on multiple tables to retrieve the data -- for each of your app queries, you are just reading one table.
Cassandra can achieve single-digit millisecond latency by optimising reads to as little as a single disk seek. Unlike traditional RDBMS, denormalised tables have a preference for faster data retrievals at the expense of multiple writes. Cheers!
7 People are following this question.
DataStax Enterprise is powered by the best distribution of Apache Cassandra ™
© 2023 DataStax, Titan, and TitanDB are registered trademarks of DataStax, Inc. and its subsidiaries in the United States and/or other countries.
Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.
Privacy Policy Terms of Use