Oliverd98 asked

Need help with tombstone issue on table with default TTL

I have created a test cluster with a table that has default_time_to_live = 3600. But when I read data from the table, I get the following warning in the logs:

Read 34 live rows and 62500 tombstone cells for query SELECT * FROM mytestks.login_token WHERE dept_id = 0 AND user_id = 1 LIMIT 5000 (see tombstone_warn_threshold)

Should I change the compaction strategy or the GC grace period to solve this?

Because of this, my queries are slow. Could you please guide me on how to solve it?

Please find the table description below for reference.

) WITH CLUSTERING ORDER BY (user_id ASC, token_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': ''}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 3600
    AND gc_grace_seconds = 345600
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND speculative_retry = '99PERCENTILE';


1 Answer

Erick Ramirez answered

Unfortunately, there isn't a quick fix for your specific problem.

Based on the information you provided, you are using Cassandra as a queue store. I imagine users log in to your app and it generates a token that's valid for one hour.

Queues are an anti-pattern for Cassandra because of their high delete workload. This is discussed in detail in the blog post Cassandra anti-patterns: Queues and queue-like datasets.

Generally this isn't a problem, but in the example above your test user has generated over 62,000 tokens. With a GC grace period of 4 days, that's the equivalent of logging in 650+ times per hour, and you end up with excessive amounts of expired tokens.
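A quick back-of-the-envelope check of that figure, using the 62,500 tombstone cells from the warning and gc_grace_seconds = 345600 from the table definition:

```python
# Sanity check of the "650+ logins per hour" estimate.
tombstones = 62_500          # tombstone cells reported in the read warning
gc_grace_seconds = 345_600   # from the table definition (4 days)

hours_retained = gc_grace_seconds / 3600   # 96 hours of retained tombstones
tokens_per_hour = tombstones / hours_retained

print(hours_retained)    # 96.0
print(tokens_per_hour)   # ~651 expired tokens accumulating per hour
```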

Part of the problem with your usage is that the clustering column is the token ID itself sorted in ascending order. This means that depending on the data type of the token ID, you are iterating over rows and rows of expired tokens to get to the valid tokens.

If you instead cluster your data with some time-based column and sort it in descending order, you know that any token older than one hour can be ignored. Here's an example table definition to illustrate what I mean:

CREATE TABLE tokens_by_user (
  user_id text,
  generated_time timestamp,
  token_id uuid,
  PRIMARY KEY (user_id, generated_time)
) WITH CLUSTERING ORDER BY (generated_time DESC);
You can then query this table for a user's valid tokens by filtering on a generated time that is one hour old or less. Since generated_time is sorted in descending order, the latest tokens appear first and you won't iterate over the expired ones. Cheers!
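To illustrate, the application can compute the one-hour cutoff client-side and bind it as a query parameter. This is a minimal sketch in Python; the table and column names follow the example above, and valid_token_query is a hypothetical helper, not part of any driver API:

```python
from datetime import datetime, timedelta, timezone

def valid_token_query(user_id, now=None):
    """Build a CQL statement and bind values that fetch only unexpired tokens."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=1)  # anything older than this has expired
    cql = ("SELECT token_id, generated_time FROM tokens_by_user "
           "WHERE user_id = %s AND generated_time > %s")
    return cql, (user_id, cutoff)

cql, params = valid_token_query("alice")
print(cql)     # the statement, ready to pass to session.execute(cql, params)
print(params)  # ("alice", <timestamp one hour ago>)
```

Because the partition is scanned in descending clustering order, the server stops at the cutoff without touching the tombstoned rows below it.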

