aditya.magotra_191687 asked ·

How do I delete data from my table given the partition key and a non-clustering column?

Hi Team,

I am quite new to Cassandra and have a question about using it.

The table structure is as follows:

  • Table name: Product Details
  • ProductFamily text
  • AccessGroup text
  • ProductDetails map<text, text>
  • PRIMARY KEY ((ProductFamily), AccessGroup)

Data relation: One product family has multiple access groups, and each access group stores its product details in a map<ProductId, Details>. The same product detail may be present in all of the access groups or only in some of them.

Scenario: We receive a delete event that contains only the ProductId and the product family.

Our implementation:

  • Fetch all access groups of the product family from the database.
  • For each access group, query the database for its map and check whether the map contains the given ProductId as a key.
  • If it does, hold that access group -> ProductId pair in memory.
  • Finally, prepare a single batch statement that deletes the ProductId from each of those access groups. We use a batch because all the rows share the same partition key.

Note: We have at most 15-20 items in a map and 8-10 access groups per product family.
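The steps above can be sketched in Python as follows. This is only an illustration under assumptions: the table name `product_details`, the lowercase column names, and the helper names are hypothetical, and the `partition` dict stands in for whatever the read returns. Since all access groups of a family live in one partition, a single query on the partition key could probably return every access group with its map, instead of one read per access group.

```python
from typing import Dict, List, Tuple

# Hypothetical table name; the question only gives the logical layout.
TABLE = "product_details"


def groups_containing(partition: Dict[str, Dict[str, str]], product_id: str) -> List[str]:
    """Return the access groups whose map has product_id as a key.

    `partition` maps access group -> ProductDetails map for ONE product family.
    """
    return sorted(group for group, details in partition.items() if product_id in details)


def build_delete_batch(product_family: str, product_id: str,
                       access_groups: List[str]) -> Tuple[str, List[str]]:
    """Build one batch that removes a single map entry per access group.

    Returns the CQL text with bind markers and the flat list of bound values.
    An empty result means there is nothing to delete.
    """
    if not access_groups:
        return "", []
    lines = ["BEGIN BATCH"]
    params: List[str] = []
    for group in access_groups:
        lines.append(
            f"  DELETE productdetails[?] FROM {TABLE} "
            "WHERE productfamily = ? AND accessgroup = ?;"
        )
        params.extend([product_id, product_family, group])
    lines.append("APPLY BATCH;")
    return "\n".join(lines), params
```

A delete for a key that is not in the map would still write a tombstone, so filtering first keeps the batch limited to the access groups that actually contain the product.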

Questions:

  1. Could you please let me know whether this is the right approach for batch deletion?
  2. If we receive thousands of such events per day, will this approach perform well?

Thanks in advance.


Update:

Our requirement is as follows (read model): for a given product family and access group, fetch all the product details from the system. For example, when a customer selects a specific product family on the web page, we check the access group of that customer and display the product details on that basis.

Note: The same product detail may be present in different access groups of the same product family.
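With the table as described, this read addresses exactly one row, because product family and access group together form the full primary key. A minimal Python sketch of that lookup, where a dict stands in for the table and the table and column names in the CQL string are assumptions:

```python
from typing import Dict, Tuple

# (product_family, access_group) -> ProductDetails map; stands in for the table.
Table = Dict[Tuple[str, str], Dict[str, str]]

# Hypothetical table and column names, shown only to make the access path explicit.
READ_CQL = (
    "SELECT productdetails FROM product_details "
    "WHERE productfamily = ? AND accessgroup = ?"
)


def product_details_for(table: Table, product_family: str, access_group: str) -> Dict[str, str]:
    """Return the product details a customer in access_group sees for product_family."""
    # A missing row simply yields an empty map.
    return dict(table.get((product_family, access_group), {}))
```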

Write model:

Our application is a completely event-based system. We receive the following messages:

1. A product detail is added for a product family.

Note: In this scenario, we need to read the access groups from Cassandra first so that we can upsert the product detail for all access groups.

2. A product detail is added to a specific access group.

3. A product detail is removed for a product family. This case creates tombstones.
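The three messages could map to CQL statements roughly as sketched below in Python. The event field names (`type`, `product_family`, `product_id`, `detail`, `access_group`), the event type strings, and the table and column names are all hypothetical; the sketch only returns statement text with bound values and does not talk to a cluster.

```python
from typing import Iterable, List, Tuple

Statement = Tuple[str, Tuple[str, ...]]

# Hypothetical table and column names.
UPSERT = (
    "UPDATE product_details SET productdetails[?] = ? "
    "WHERE productfamily = ? AND accessgroup = ?"
)
REMOVE = (
    "DELETE productdetails[?] FROM product_details "
    "WHERE productfamily = ? AND accessgroup = ?"
)


def statements_for(event: dict, access_groups: Iterable[str]) -> List[Statement]:
    """Translate one event into the statements to run.

    `access_groups` is the list of groups read beforehand for the family;
    it is ignored for events that already name a single access group.
    """
    kind = event["type"]
    family, product_id = event["product_family"], event["product_id"]
    if kind == "added_to_family":
        # Message 1: upsert the detail into every access group of the family.
        return [(UPSERT, (product_id, event["detail"], family, g)) for g in access_groups]
    if kind == "added_to_access_group":
        # Message 2: upsert into the one access group named by the event.
        return [(UPSERT, (product_id, event["detail"], family, event["access_group"]))]
    if kind == "removed_from_family":
        # Message 3: remove the map entry; each delete writes a tombstone.
        return [(REMOVE, (product_id, family, g)) for g in access_groups]
    raise ValueError(f"unknown event type: {kind}")
```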


Can you please share some design alternatives?

Thanks in advance.

cassandra, data modeling
1 comment

Hi @aditya.magotra_191687 ,

I have updated your original question with your follow-up as it was too long for a comment.

Cheers!


1 Answer

saravanan.chinnachamy_185977 answered ·

@aditya.magotra_191687 Your data model does not seem very efficient for Cassandra. Cassandra data modeling is very different from modeling for an RDBMS. Your current model appears to involve multiple reads, wide rows, and deletes that lead to tombstones. If you can share more details about your requirements, we can suggest some design alternatives.

Cassandra data modeling is based on the following concepts:

  • Query-driven modeling: Identify the query first, then design a table to satisfy it. This can lead to data duplication, which is perfectly OK.
  • Minimize the number of partitions read: When you issue a read query, you want to read rows from as few partitions as possible.
  • Spread data evenly around the cluster: You want every node in the cluster to hold roughly the same amount of data.
  • Maximize writes: Writes are cheap in Cassandra, so use extra writes to achieve faster reads.
  • Minimize tombstones: Every delete creates a tombstone, and a large number of tombstones is extremely detrimental to reads. Manage deletes carefully and model tables in a way that minimizes them.
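To get a rough feel for the tombstone point, here is a back-of-the-envelope Python sketch. It assumes one tombstone per deleted map entry and that tombstones are kept for at least `gc_grace_seconds` (864000 seconds, i.e. 10 days, by default) before compaction can drop them. The figure of 5000 events per day is a made-up example, not a number from the question.

```python
SECONDS_PER_DAY = 86400
DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default gc_grace_seconds (10 days)


def tombstones_per_day(events_per_day: int, groups_deleted_per_event: int) -> int:
    """One tombstone per map entry deleted, summed over all delete events in a day."""
    return events_per_day * groups_deleted_per_event


def tombstones_awaiting_purge(events_per_day: int, groups_deleted_per_event: int,
                              gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> int:
    """Rough count of tombstones that cannot be dropped yet at a steady delete rate."""
    days_retained = gc_grace_seconds / SECONDS_PER_DAY
    return int(tombstones_per_day(events_per_day, groups_deleted_per_event) * days_retained)


# Hypothetical load: 5000 delete events per day, each touching 10 access groups.
daily = tombstones_per_day(5000, 10)
retained = tombstones_awaiting_purge(5000, 10)
```

These totals are spread over many partitions. What hurts a given read is the number of tombstones it has to scan in the partition it touches, so the per-partition delete rate matters more than the cluster-wide total.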

Please review Cassandra data modeling concepts in detail at the following resources.
