
yangli136_124274 asked Erick Ramirez answered

How does Cassandra manage TTL for Collections and UDT?

Cassandra does not support the TTL() function on collections, and it returns null when TTL() is applied to a non-frozen UDT.

However, "USING TTL nnn" can be used when inserting or updating collections and non-frozen UDTs.

Questions:

1. Are TTL values obeyed for collections and non-frozen UDTs?

2. Why can't the TTL() function be applied to collections?

3. Why does the TTL() function return null for non-frozen UDTs?


1 Answer

Erick Ramirez answered

Data expiration on elements of a collection or fields of a UDT is handled in the same way as other TTL'd data in Cassandra. The difference is in how you can query the remaining time to live with the TTL() function, because of the underlying structure of collections and UDTs.

Let me respond to your questions directly.

1. Are TTL values obeyed for collections and non-frozen UDTs?

Yes, they are. To illustrate, here's the schema for my example table:

CREATE TABLE community.friends_by_user (
    user text PRIMARY KEY,
    friends set<text>
);

Here's an example user I've inserted with a TTL:

cqlsh:community> INSERT INTO friends_by_user (user, friends)
                   VALUES ( 'alice', {'bob', 'charlie', 'dianne'})
                   USING TTL 300;
cqlsh:community> SELECT * FROM friends_by_user ;

 user  | friends
-------+------------------------------
 alice | {'bob', 'charlie', 'dianne'}

If we inspect the SSTable for this partition, it shows that the expiration is set for 5 minutes (USING TTL 300) so the data will expire as expected:

[
  {
    "partition" : {
      "key" : [ "alice" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 19,
        "liveness_info" : { "tstamp" : "2021-06-01T02:29:37.604229Z", "ttl" : 300, "expires_at" : "2021-06-01T02:34:37Z", "expired" : false },
        "cells" : [
          { "name" : "friends", "deletion_info" : { "marked_deleted" : "2021-06-01T02:29:37.604228Z", "local_delete_time" : "2021-06-01T02:29:37Z" } },
          { "name" : "friends", "path" : [ "bob" ], "value" : "" },
          { "name" : "friends", "path" : [ "charlie" ], "value" : "" },
          { "name" : "friends", "path" : [ "dianne" ], "value" : "" }
        ]
      }
    ]
  }
]

Specifically, the data was written at 02:29:37 and expires at 02:34:37:

        "liveness_info" : { "tstamp" : "2021-06-01T02:29:37.604229Z", "ttl" : 300, "expires_at" : "2021-06-01T02:34:37Z", "expired" : false },
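In this dump, the expiry looks like the write timestamp, truncated to whole seconds, plus the TTL. Here is a minimal Python sketch of that arithmetic using the values above. The helper name expires_at is made up for illustration and is not part of any Cassandra tooling:

```python
from datetime import datetime, timedelta, timezone

def expires_at(written: datetime, ttl_seconds: int) -> datetime:
    """Hypothetical helper: write time (whole seconds) plus the TTL."""
    return written.replace(microsecond=0) + timedelta(seconds=ttl_seconds)

# "tstamp" from liveness_info, with "ttl" : 300
written = datetime(2021, 6, 1, 2, 29, 37, 604229, tzinfo=timezone.utc)
print(expires_at(written, 300).isoformat())  # 2021-06-01T02:34:37+00:00
```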

2. Why can't the TTL() function be applied to collections?

It is not possible to use the TTL() function on collections because CQL returns the collection as a whole rather than as individual elements, so there is no way to query the expiration of a single element.

The same applies to querying the TTL of the whole collection. Again, let me illustrate with an example by adding a new friend to the collection:

cqlsh:community> UPDATE friends_by_user
                   USING TTL 3600
                   SET friends = friends + {'erin'}
                   WHERE user = 'alice';

If we inspect the SSTable, we now have a new friend erin added with its own TTL:

[
  {
    "partition" : {
      "key" : [ "alice" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 19,
        "liveness_info" : { "tstamp" : "2021-06-01T02:29:37.604229Z", "ttl" : 300, "expires_at" : "2021-06-01T02:34:37Z", "expired" : false },
        "cells" : [
          { "name" : "friends", "deletion_info" : { "marked_deleted" : "2021-06-01T02:29:37.604228Z", "local_delete_time" : "2021-06-01T02:29:37Z" } },
          { "name" : "friends", "path" : [ "bob" ], "value" : "" },
          { "name" : "friends", "path" : [ "charlie" ], "value" : "" },
          { "name" : "friends", "path" : [ "dianne" ], "value" : "" },
          { "name" : "friends", "path" : [ "erin" ], "value" : "", "tstamp" : "2021-06-01T02:30:29.413237Z", "ttl" : 3600, "expires_at" : "2021-06-01T03:30:29Z", "expired" : false }
        ]
      }
    ]
  }
]

Specifically, this line:

          { "name" : "friends", "path" : [ "erin" ], "value" : "", "tstamp" : "2021-06-01T02:30:29.413237Z", "ttl" : 3600, "expires_at" : "2021-06-01T03:30:29Z", "expired" : false }

Since each element can carry its own TTL, the collection as a whole has no single expiration. Here erin expires at 03:30:29 while bob, charlie and dianne expire at 02:34:37, so any one value returned for the whole collection would contradict the expiration of some of its elements. For this reason, the use of the TTL() function on collections is not allowed.
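To make that concrete, here is a small Python sketch (not Cassandra code) that models each element's expiry using the values from the SSTable dump above and works out the remaining TTL per element. The names expiries and remaining_ttl, and the query time, are assumptions for illustration only:

```python
from datetime import datetime, timezone

def utc(h, m, s):
    """Build a UTC time on 2021-06-01, the date used in the dump."""
    return datetime(2021, 6, 1, h, m, s, tzinfo=timezone.utc)

# Expiry per set element, taken from the SSTable dump.
expiries = {
    "bob": utc(2, 34, 37),
    "charlie": utc(2, 34, 37),
    "dianne": utc(2, 34, 37),
    "erin": utc(3, 30, 29),
}

def remaining_ttl(now):
    """Seconds left per element; already-expired elements are dropped."""
    return {
        name: int((exp - now).total_seconds())
        for name, exp in expiries.items()
        if exp > now
    }

ttls = remaining_ttl(utc(2, 31, 0))  # hypothetical query time
print(ttls)  # {'bob': 217, 'charlie': 217, 'dianne': 217, 'erin': 3569}
print(len(set(ttls.values())))  # 2 distinct values, so no single answer
```

With two different remaining TTLs inside the same set, there is no one number that a collection-level TTL() could return.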

3. Why does the TTL() function return null for non-frozen UDTs?

The same reason applies to non-frozen UDTs. Since it is possible to set the TTL on a single UDT field, the non-frozen UDT column as a whole has no single expiration to report, because any one value would contradict the expiration of individual fields. Cheers!
