[FOLLOW UP QUESTION TO #6283]
If read-before-write is not done for Sets / Maps, then how is the "uniqueness" maintained during a write?
Cassandra does not need to read the contents of a set or map when updating the collection because the elements which already exist don't matter.
Let me illustrate it with a table containing friends:
CREATE TABLE friends_by_user ( user text PRIMARY KEY, friends set<text> )
It doesn't matter that I insert an unordered list of friends:
cqlsh> INSERT INTO friends_by_user (user, friends) VALUES ( 'erick', { 'tom', 'dick', 'harry', 'sally' } );
When I retrieve the data, it will be returned in lexical order:
cqlsh> SELECT * FROM friends_by_user WHERE user = 'erick';

 user  | friends
-------+-----------------------------------
 erick | {'dick', 'harry', 'sally', 'tom'}
By extension, it doesn't matter where a new element is added to the collection because there's no particular order:
cqlsh> UPDATE friends_by_user SET friends = friends + {'alice'} WHERE user = 'erick';
The order doesn't matter because it gets sorted when it is time to read the data:
cqlsh> SELECT * FROM friends_by_user WHERE user = 'erick';

 user  | friends
-------+--------------------------------------------
 erick | {'alice', 'dick', 'harry', 'sally', 'tom'}
And since a set collection only stores unique elements, adding Alice again won't make a difference:
cqlsh> UPDATE friends_by_user SET friends = friends + {'alice'} WHERE user = 'erick';
cqlsh> SELECT * FROM friends_by_user WHERE user = 'erick';

 user  | friends
-------+--------------------------------------------
 erick | {'alice', 'dick', 'harry', 'sally', 'tom'}
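To make the idea concrete, here is a minimal Python sketch of the behaviour above. It is only a conceptual model, not Cassandra's storage code: the names `add_to_set`, `read_set` and `set_log` are made up for illustration, and a simple counter stands in for write timestamps. Each write just appends one cell per element without looking at what is already stored, and duplicates are collapsed when the data is read.

```python
# Conceptual model only: this is NOT Cassandra's storage engine code.
# All names here (add_to_set, read_set, set_log) are hypothetical.
from itertools import count

_clock = count(1)  # stand-in for write timestamps


def add_to_set(write_log, elements):
    """Append one cell per element; nothing already stored is read."""
    ts = next(_clock)
    for element in elements:
        write_log.append((element, ts))


def read_set(write_log):
    """Merge cells at read time: one entry per element, in sorted order."""
    latest = {}
    for element, ts in write_log:
        if ts >= latest.get(element, 0):
            latest[element] = ts
    return sorted(latest)


set_log = []
add_to_set(set_log, {"tom", "dick", "harry", "sally"})
add_to_set(set_log, {"alice"})
add_to_set(set_log, {"alice"})  # duplicate write, still no read needed
friends = read_set(set_log)
print(friends)
```

The second write of 'alice' lands in the log as another cell, but the read only ever sees one 'alice' because the element itself identifies the cell.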
A map collection stores key-value pairs. The key is required to store and retrieve the corresponding value, so by definition the key is unique.
I'll use this table of addresses to illustrate:
CREATE TABLE addresses_by_user ( user text PRIMARY KEY, addresses map<text, text> )
Inserting Jack's home and work addresses:
cqlsh> INSERT INTO addresses_by_user (user, addresses)
       VALUES ('jack',
         {'home': '100 Main St', 'work': '1 5th Ave'}
       );
we get:
cqlsh> SELECT * FROM addresses_by_user WHERE user = 'jack';

 user | addresses
------+----------------------------------------------
 jack | {'home': '100 Main St', 'work': '1 5th Ave'}
Adding another address doesn't require Cassandra to read what's already stored because it simply gets added as a new key-value pair:
cqlsh> UPDATE addresses_by_user
       SET addresses = addresses + {'other': '300 Rodeo Dr'}
       WHERE user = 'jack';
Updating the value of an existing address doesn't require a read-before-write either because C* just writes a new mutation for the existing key with a new write timestamp, and the value with the newest timestamp wins when the data is read. In this example, we're updating the value of Jack's 'other' address:
cqlsh> UPDATE addresses_by_user
       SET addresses['other'] = '10 Downing St'
       WHERE user = 'jack';
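The same idea for maps can be sketched in a few lines of Python. Again, this is only a conceptual model under my own assumptions, not how Cassandra is implemented: `put`, `read_map` and `map_log` are hypothetical names, and the integer timestamps are invented. Every write appends a (key, value, timestamp) cell blindly, and the read keeps the value with the newest timestamp for each key.

```python
# Conceptual model only: this is NOT Cassandra's storage engine code.
# All names here (put, read_map, map_log) are hypothetical.

def put(write_log, key, value, timestamp):
    """Append a cell for the key; the existing value is never read."""
    write_log.append((key, value, timestamp))


def read_map(write_log):
    """Merge cells at read time: the newest timestamp wins for each key."""
    latest = {}
    for key, value, ts in write_log:
        if key not in latest or ts >= latest[key][1]:
            latest[key] = (value, ts)
    return {key: latest[key][0] for key in sorted(latest)}


map_log = []
put(map_log, "home", "100 Main St", 1)
put(map_log, "work", "1 5th Ave", 1)
put(map_log, "other", "300 Rodeo Dr", 2)
put(map_log, "other", "10 Downing St", 3)  # overwrite without reading first
addresses = read_map(map_log)
print(addresses)
```

Both 'other' cells still exist in the log, but only the one written last is returned, which is why the update never has to check what was there before.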
I hope this clarifies it for you. Cheers!