question

jmirick_83131 avatar image
jmirick_83131 asked jmirick_83131 commented

Need CQL query that returns rows starting with specified row

Given the following legacy table:

create table "MyKeyspace"."CkeyTest" (
  accountid bigint, 
  descbigint1 bigint, 
  ascbigint2 bigint, 
  ascstring1 ascii,
  value text, 
PRIMARY KEY ((accountid), descbigint1, ascbigint2, ascstring1))
WITH CLUSTERING ORDER BY (descbigint1 DESC, ascbigint2 ASC, ascstring1 ASC) 
AND COMPACT STORAGE;

And 15 rows inserted such that:

SELECT accountid, descbigint1, ascbigint2, ascstring1, value 
FROM "MyKeyspace"."CkeyTest"
WHERE accountid=111;

returns:

accountid, descbigint1, ascbigint2, ascstring1, value
===================================================
111,262,325,cherry,red
111,262,333,cherry,red
111,252,324,kiwi,green
111,252,343,kiwi,green
111,242,323,banana,yellow
111,242,353,banana,yellow ***
111,232,322,apple,green
111,232,363,apple,green
111,222,333,apple,green
111,222,333,banana,yellow
111,222,333,cherry,red
111,222,333,kiwi,green
111,222,343,kiwi,green
111,222,353,banana,yellow
111,222,363,apple,green

What is the CQL that will give me rows starting with row *** and the ones following, i.e.:

accountid, descbigint1, ascbigint2, ascstring1, value
====================================================
111,242,353,banana,yellow ***
111,232,322,apple,green
111,232,363,apple,green
111,222,333,apple,green
111,222,333,banana,yellow
111,222,333,cherry,red
111,222,333,kiwi,green
111,222,343,kiwi,green
111,222,353,banana,yellow
111,222,363,apple,green

I think the first clustering column being DESC is messing me up. The closest I've come is the following query but it returns rows starting with clustering column 242:

SELECT accountid, descbigint1, ascbigint2, ascstring1, value 
FROM "MyKeyspace"."CkeyTest" 
WHERE accountid = 111 AND 
(descbigint1) <= (242) AND
(descbigint1,ascbigint2,ascstring1) >= (1,353,'banana');

accountid, descbigint1, ascbigint2, ascstring1, value
====================================================
111,242,323,banana,yellow
111,242,353,banana,yellow ***
111,232,322,apple,green
111,232,363,apple,green
111,222,333,apple,green
111,222,333,banana,yellow
111,222,333,cherry,red
111,222,333,kiwi,green
111,222,343,kiwi,green
111,222,353,banana,yellow
111,222,363,apple,green

Script to create table and insert data CkeyTest.txt

If I insert the same data with all clustering columns in ASC order, I seem to have no issue querying with any of the rows as a starting row.

Note I'm able to do this kind of query with Hector. I'm trying move our REST API from Hector to CQL and our paging relies on a "nextkey" comprised of serialized clustering columns (Composite column name for those familiar with Hector) as a starting point. I'm trying to maintain this paging if I can as we have several tables with "mixed order" clustering column like above and we'll be in a mixed Hector/CQL mode for awhile as we roll this out.

Lastly, I really want to avoid filtering client side as our tables may have thousands of rows with the same value in the first clustering column. Any help is appreciated!

cql
ckeytest.txt (2.3 KiB)
3 comments
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

jmirick_83131 avatar image jmirick_83131 commented ·

Fixed typo in initial post (sample result column name 'ascbigint1' -> 'descbigint1' - hopefully this allows the post to be taken out of "awaiting moderation" soon!

0 Likes 0 ·
steve.lacerda avatar image steve.lacerda ♦ commented ·

Can you give an example of what you would like to see? Based on your clustering key and order:

descbigint1 DESC, ascbigint2 ASC, ascstring1 ASC

The result that you're getting is what I'd expect. You'll get the results with descbigint1 DESC, then from those results, ascbigint2 will be ASC, and then based on those results ascstring1 will be ASC.

0 Likes 0 ·
jmirick_83131 avatar image jmirick_83131 steve.lacerda ♦ commented ·

Say that I did the following query to get the first 5 rows:

SELECT accountid, descbigint1, ascbigint2, ascstring1, value 
FROM "MyKeyspace"."CkeyTest"
WHERE accountid=111 LIMIT 5;

accountid, descbigint1, ascbigint2, ascstring1, value
===================================================
111,262,325,cherry,red
111,262,333,cherry,red
111,252,324,kiwi,green
111,252,343,kiwi,green
111,242,323,banana,yellow

What is the query that would give me the next 5, e.g.:

accountid, descbigint1, ascbigint2, ascstring1, value
====================================================
111,242,353,banana,yellow ***
111,232,322,apple,green
111,232,363,apple,green
111,222,333,apple,green
111,222,333,banana,yellow

I can do this in Hector with a range query starting with the *** row, which I know because I previously queried 6 rows and captured the 6th as my starting point. I'm trying to emulate this in CQL to maintain our paging logic as we move from Hector to CQL.

0 Likes 0 ·
Erick Ramirez avatar image
Erick Ramirez answered jmirick_83131 commented

There isn't a valid CQL that would allow you to retrieve the set of rows you want, just the specific row (not a set).

Your question didn't make much sense to me initially since I couldn't come up with a scenario where it's possible to know the specific row up front unless you've already iterated over the preceding rows. However I gathered from your follow-up comments that you were in fact iterating through the results and just wanted to get to the next set of rows in which case paging with the Java driver is definitely what you need. Cheers!

1 comment Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

jmirick_83131 avatar image jmirick_83131 commented ·
ok thank you both @steve.lacerda and @Erick Ramirez !
0 Likes 0 ·
steve.lacerda avatar image
steve.lacerda answered jmirick_83131 commented

Thanks for the update. What you need is paging. I'm not sure which type and/or version of the driver you're using, but since Java is the most used, you can see how that's done here:

https://docs.datastax.com/en/developer/java-driver/4.13/manual/core/paging/

There are some examples of synchronous and asynchronous paging at the above link. Each type of driver has something similar.

3 comments Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

jmirick_83131 avatar image jmirick_83131 commented ·
Yes, since it took about a week for my post to "clear" the moderator(!), I actually started going down the route of implementing paging using the Java Driver. I was hoping to avoid this, as I will need to support both types of paging in our API as we move all of our queries from Hector to CQL.


I'm going to defer "Accepting" this answer, as I still would like to know if such a CQL query is possible. I am probably not the only one in this predicament, though I know we are dwindling ;-) Hector was (is) really powerful and if you used it from the beginning and exploited Composite Column Names with DESC/ASC order Comparators, you want to know what your options are. Thank you for responding!

0 Likes 0 ·
steve.lacerda avatar image steve.lacerda ♦ jmirick_83131 commented ·

Unfortunately, I'm not aware of anything that can do what you're suggesting except for paging. Plain cql does not have anything like a start/end option, unless you're using solr and then you could use that type of feature.

0 Likes 0 ·
jmirick_83131 avatar image jmirick_83131 steve.lacerda ♦ commented ·

Range queries DO appear to work if you properly qualify higher level keys, etc See https://docs.datastax.com/en/dse/5.1/cql/cql/cql_using/whereClustering.html and https://www.datastax.com/blog/deep-look-cql-where-clause. All examples feature clustering keys in the same order, e.g., all ASC. I think that's where my use case breaks down. If instead I create a different table "AscCkeyTest" with the attached scripts, with the "descbigint1" clustering key changed to ASC, queries like the following appear to do the right thing:

select account_id, descbigint1, ascbigint2, ascstring1, value FROM "MyKeyspace"."AscCkeyTest" WHERE
account_id=111 AND
(descbigint1, ascbigint2, ascstring1) >= (222,333,'cherry');

As I said in the original post, I tried to accommodate the first clustering key in DESC order by qualifying it separately and with a range featuring the "lowest value" - close but not close enough.

0 Likes 0 ·