scherian_188962 asked ·

Which compaction strategy should I use for a table that tracks user activity which requires a read-before-write?

Sample table:

CREATE TABLE user_activity (
    user_id int,
    activity_id uuid,
    activity_timestamp timestamp,
    score int,
    PRIMARY KEY (user_id, activity_timestamp)
) WITH CLUSTERING ORDER BY (activity_timestamp DESC);

Whenever a user performs a new activity, their previous activities are read first, a score is calculated, and the new activity is then persisted along with the calculated score.

For example, when a new user USER1 performs their first activity, the SELECT before persisting returns zero records. When they perform the second activity, the previous record is read and a score of 1 is given, so the second activity is persisted with a score of 1.

A TTL is also set on every insert. About 30% of the overall records will be updated, exactly once each, and the interval between an insert and its update will be between 10 minutes and 1 hour.
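
The flow described above could be sketched in CQL roughly as follows. This is only an illustration: the literal values, the 86400-second TTL, and the choice of `score` as the updated column are assumptions, and the score itself is computed by the application between steps 1 and 2.

```cql
-- 1. Read the user's previous activities (returned newest first,
--    per the table's clustering order).
SELECT activity_timestamp, score
FROM user_activity
WHERE user_id = 1;

-- 2. Persist the new activity with the score the application computed.
--    The TTL of 86400 s (1 day) is a placeholder, not the real value.
INSERT INTO user_activity (user_id, activity_timestamp, activity_id, score)
VALUES (1, '2020-07-15 10:00:00+0000', uuid(), 1)
USING TTL 86400;

-- 3. One-time update, 10 minutes to 1 hour later, for roughly 30% of rows.
--    Updating score is an assumption; the TTL here applies only to the
--    column being written.
UPDATE user_activity USING TTL 86400
SET score = 2
WHERE user_id = 1 AND activity_timestamp = '2020-07-15 10:00:00+0000';
```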

In this scenario, which is the preferred compaction strategy to be used?

version: Apache Cassandra 3.11.4

cassandra compaction

@scherian_188962, can you also update your original post with the version of Cassandra/DSE that you're running? Cheers!


I have updated the post.

smadhavan replied to scherian_188962 ·

@scherian_188962, I don't think I'm seeing the version of C*/DSE being updated in the original post. Where did you update that info?


1 Answer

smadhavan answered ·

@scherian_188962, assuming that you won't be upserting (i.e. editing/updating) the data once it is written to the table with a TTL, TimeWindowCompactionStrategy (TWCS) is best suited for this use case. As always, you'll have to test this with a production-like load in a lower environment to gauge which strategy is better for your workload. For further reading, you can refer to resources on choosing the compaction strategy.
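
As a rough sketch of what switching the sample table to TWCS might look like: the window unit and size below are placeholder values, not a recommendation, and would need to be tuned against the TTL you actually use and validated in testing.

```cql
-- Switch the sample table to TimeWindowCompactionStrategy.
-- compaction_window_unit accepts MINUTES, HOURS or DAYS;
-- the 1-day window here is only an illustrative assumption.
ALTER TABLE user_activity
WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
};
```

A window comfortably larger than the 10 minute to 1 hour update delay would tend to keep an insert and its later update in the same window, though that is something to confirm under a production-like load.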


@scherian_188962, you could also "Accept" this answer if you're done. Thanks!


@smadhavan, I assume your answer remains the same after the post update regarding the version and the data upserts?

smadhavan replied to scherian_188962 ·

@scherian_188962, yes that's correct!
