I'm learning the datastax node.js driver, and I don't really see any place in the documentation about model associations. How can I make a user object, and then associate many 'friends' with a particular user? Then query friends? Add more friends?
I'm learning the datastax node.js driver, and I don't really see any place in the documentation about model associations. How can I make a user object, and then associate many 'friends' with a particular user? Then query friends? Add more friends?
Hi,
There are 2 parts in this question.
How to handle associations in Cassandra ?
In Cassandra there is no such things as integrity constraints (unicity or consistency). There are no foreign keys. As such, we need to rethink the data model to design associations.
First solution is to use a single table and nested the list of friends as a sub type User Defined Type (COMPOSITION) like here
CREATE TYPE IF NOT EXISTS friend ( friendname text, firstname text, lastname text ); CREATE TABLE IF NOT EXISTS user( username text, email text, friends set<frozen<friend>>, PRIMARY KEY (username) );
With this model this is how you perform CRUD operations
// Create INSERT INTO user(username, email, friends) VALUES('Cedrick','clu@sample.com', { {friendname:'Bob',firstname:'fff', lastname:'lll'}, {friendname:'Bill',firstname:'ddd', lastname:'kkk'}}); // List friends for Cedrick SELECT friends FROM user WHERE username='Cedrick' // Add a friend to Cedrick UPDATE user SET friends = friends + { {friendname:'Jack',firstname:'ddd',lastname:'kkk'}} WHERE username='Cedrick'; // Remove a friend UPDATE killrvideo.user SET friends = friends - { {friendname:'Jack',firstname:'ddd',lastname:'kkk'}} WHERE username='Cedrick'; // Replace the friend lists UPDATE killrvideo.user SET friends = { {friendname:'Jack',firstname:'ddd',lastname:'kkk'}} WHERE username='Cedrick';
So at the coding level there is not much to check if you add friends
in the select statements everything is retrieved as a single query which is fine because all the data is located at the same place, in the same node.
Second solution solution is to denormalize the association into single table (AGGREGATION) and this will always be the go to for many-to-many but let's see what it look likes for one-to-many:
CREATE TABLE IF NOT EXISTS friends_by_user ( username text, user_email text static, friendname text, friend_firstname text, friend_lastname text, PRIMARY KEY ((username), friendname) ); // Create INSERT INTO friends_by_user(username, user_email, friendname, friend_firstname, friend_lastname) VALUES ('Cedrick','clu@sample.com', 'Bob', 'fff','lll'); INSERT INTO friends_by_user(username, user_email, friendname, friend_firstname, friend_lastname) VALUES ('Cedrick','clu@sample.com', 'Bill', 'ddd','kkk'); // List friends for Cedrick SELECT friends FROM friends_by_user WHERE username='Cedrick' // Add a friend to Cedrick INSERT INTO friends_by_user(username, user_email, friendname, friend_firstname, friend_lastname) VALUES ('Cedrick','clu@sample.com', 'Jack', 'ddd','kkk'); // Remove a friend DELETE FROM friends_by_user WHERE username='Cedrick' AND friendname='Jack' // Replace the friend lists // delete and recreate
Some note:
So here we will have 2 queries. "Get me the list of users for a group" and "get me the list of groups for user". 2 queries means 2 table in the Cassandra world.
CREATE TABLE IF NOT EXISTS groups_by_user ( username text, email text, groups set<text>, PRIMARY KEY (username) ); CREATE TABLE IF NOT EXISTS users_by_group ( groupname text, users set<text>, PRIMARY KEY (groupname) );
To group operations on multiple tables we use batches like below
BEGIN BATCH INSERT INTO groups_by_user(username, email, groups) VALUES('Cedrick','clu@sample.com', {'USER', 'ADMIN'}); INSERT INTO groups_by_user(username, email, groups) VALUES('Erick','era@sample.com', {'USER'}); INSERT INTO users_by_group(groupname, users) VALUES('USER', {'Cedrick','Erick'}); INSERT INTO users_by_group(groupname, users) VALUES('ADMIN', {'Cedrick'}); APPLY BATCH;
When you edit a group or a user you have to update both table using the BACTH.
Coding
Remember there are no links, no joins between tables and the logic to fetch sub value by using getters and setters here made not sense.
You will create an object per table and always retrieve everything you need in one go. This is fast because all info related to a record is at the same place, same node.
I will amend the answer when I do have dedicated code for this sample.
5 People are following this question.
DataStax Enterprise is powered by the best distribution of Apache Cassandra ™
© 2023 DataStax, Titan, and TitanDB are registered trademarks of DataStax, Inc. and its subsidiaries in the United States and/or other countries.
Apache, Apache Cassandra, Cassandra, Apache Tomcat, Tomcat, Apache Lucene, Lucene, Apache Solr, Apache Hadoop, Hadoop, Apache Spark, Spark, Apache TinkerPop, TinkerPop, Apache Kafka and Kafka are either registered trademarks or trademarks of the Apache Software Foundation or its subsidiaries in Canada, the United States and/or other countries.
Privacy Policy Terms of Use