PLANNED MAINTENANCE

Hello, DataStax Community!

We want to make you aware of a few operational updates which will be carried out on the site. We are working hard to streamline the login process to integrate with other DataStax resources. As such, you will soon be prompted to update your password. Please note that your username will remain the same.

As we work to improve your user experience, please be aware that login to the DataStax Community will be unavailable for a few hours on:

  • Wednesday, July 15 16:00 PDT | 19:00 EDT | 20:00 BRT
  • Thursday, July 16 00:00 BST | 01:00 CEST | 04:30 IST | 07:00 CST | 09:00 AEST

For more info, check out the FAQ page. Thank you for being a valued member of our community.


question

Cedrick Lunven avatar image
Cedrick Lunven asked ·

How to handle associations with Cassandra ?

I'm learning the datastax node.js driver, and I don't really see any place in the documentation about model associations. How can I make a user object, and then associate many 'friends' with a particular user? Then query friends? Add more friends?

collections
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

1 Answer

Cedrick Lunven avatar image
Cedrick Lunven answered ·

Hi,

There are 2 parts in this question.

  • How to handle associations in Cassandra ?
  • How to use them with the nodejs driver ?


How to handle associations in Cassandra ?

In Cassandra there is no such things as integrity constraints (unicity or consistency). There are no foreign keys. As such, we need to rethink the data model to design associations.


  • one-to-many : For this queries you will always find the list of friends for a dedicated user.

First solution is to use a single table and nested the list of friends as a sub type User Defined Type (COMPOSITION) like here

CREATE TYPE IF NOT EXISTS friend (
 friendname text, 
 firstname text, 
 lastname text
);

CREATE TABLE IF NOT EXISTS user( 
   username text, 
   email text, 
   friends set<frozen<friend>>, 
   PRIMARY KEY (username) 
); 

With this model this is how you perform CRUD operations

// Create INSERT INTO user(username, email, friends) VALUES('Cedrick','clu@sample.com', {
                 {friendname:'Bob',firstname:'fff', lastname:'lll'}, {friendname:'Bill',firstname:'ddd', lastname:'kkk'}}); 

// List friends for Cedrick 
SELECT friends FROM user WHERE username='Cedrick' 

// Add a friend to Cedrick 
UPDATE user 
SET friends = friends + {
                 {friendname:'Jack',firstname:'ddd',lastname:'kkk'}} 
WHERE username='Cedrick'; 

// Remove a friend 
UPDATE killrvideo.user 
SET friends = friends - {
                 {friendname:'Jack',firstname:'ddd',lastname:'kkk'}} 
WHERE username='Cedrick'; 

// Replace the friend lists 
UPDATE killrvideo.user 
SET friends = {
                 {friendname:'Jack',firstname:'ddd',lastname:'kkk'}} 
WHERE username='Cedrick';

So at the coding level there is not much to check if you add friends in the select statements everything is retrieved as a single query which is fine because all the data is located at the same place, in the same node.


Second solution solution is to denormalize the association into single table (AGGREGATION) and this will always be the go to for many-to-many but let's see what it look likes for one-to-many:

CREATE TABLE IF NOT EXISTS friends_by_user ( 
   username text, 
   user_email text static, 
   friendname text, 
friend_firstname text, 
friend_lastname text, 
PRIMARY KEY ((username), friendname) 
); 

// Create 
INSERT INTO friends_by_user(username, user_email, friendname, friend_firstname, friend_lastname) VALUES ('Cedrick','clu@sample.com', 'Bob', 'fff','lll'); 
INSERT INTO friends_by_user(username, user_email, friendname, friend_firstname, friend_lastname) VALUES ('Cedrick','clu@sample.com', 'Bill', 'ddd','kkk'); 

// List friends for Cedrick 
SELECT friends 
FROM friends_by_user 
WHERE username='Cedrick' 

// Add a friend to Cedrick 
INSERT INTO friends_by_user(username, user_email, friendname, friend_firstname, friend_lastname) VALUES ('Cedrick','clu@sample.com', 'Jack', 'ddd','kkk'); 

// Remove a friend 
DELETE FROM friends_by_user WHERE username='Cedrick' AND friendname='Jack' 

// Replace the friend lists 
// delete and recreate

Some note:

  • username is the Partition key, all friends for the same user will be located in the same partition, on the same node (fast)
  • user_email is a static column because it will be replacted with same value for each record in the partition
  • you have one record per friend.


  • many-to-many : For this one let's use a user/group sample. A user can have multiple groups and a groups can have multiple users


So here we will have 2 queries. "Get me the list of users for a group" and "get me the list of groups for user". 2 queries means 2 table in the Cassandra world.

CREATE TABLE IF NOT EXISTS groups_by_user ( 
username text, 
email text, 
groups set<text>, 
PRIMARY KEY (username) 
); 

CREATE TABLE IF NOT EXISTS users_by_group ( 
groupname text, 
users set<text>, 
PRIMARY KEY (groupname) 
);

To group operations on multiple tables we use batches like below

BEGIN BATCH 
INSERT INTO groups_by_user(username, email, groups) 
VALUES('Cedrick','clu@sample.com', {'USER', 'ADMIN'}); 
INSERT INTO groups_by_user(username, email, groups) 
VALUES('Erick','era@sample.com', {'USER'}); 
INSERT INTO users_by_group(groupname, users) 
VALUES('USER', {'Cedrick','Erick'}); 
INSERT INTO users_by_group(groupname, users) 
VALUES('ADMIN', {'Cedrick'}); 
APPLY BATCH;

When you edit a group or a user you have to update both table using the BACTH.


Coding

Remember there are no links, no joins between tables and the logic to fetch sub value by using getters and setters here made not sense.

You will create an object per table and always retrieve everything you need in one go. This is fast because all info related to a record is at the same place, same node.


  1. For mapping objects to table here is the documentation https://docs.datastax.com/en/developer/nodejs-driver/4.4/api/module.mapping/class.ModelMapper/
  2. Now you have a full working nodeJS application here to help you : https://github.com/KillrVideo/killrvideo-nodejs
  3. We do have tons of snippet and sample code for JAVA illustrating this behaviour and in the process the build the same for NodeJs here : https://github.com/clun/java-cassandra-driver-from3x-to4x/tree/master/example-4x/src/main/java/com/datastax/samples

I will amend the answer when I do have dedicated code for this sample.

Share
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.