anson asked Erick Ramirez answered

How does pagination work?

Hi, I have a keyspace with more than 1 million records spread across 3 nodes. I want to use pagination to fetch 1000 records at a time. I went through the Java driver documentation to see how to implement it, and here is the simplest example I could find:

Statement stmt = new SimpleStatement("SELECT * FROM images");
ResultSet rs = session.execute(stmt);

What I need to understand is how this is handled internally. A plain 'SELECT * FROM images' would be expensive and might time out, and paging 1000 rows at a time solves that. But how does paging fetch the first 1000 rows? We are not explicitly providing a partition key in the query above. Does it follow the same read path in Cassandra?

It would be great if someone could explain the internal path or flow by which pagination occurs.



1 Answer

Erick Ramirez answered

The drivers automatically limit the number of rows the server returns per request by breaking the result set into "pages". Without paging, a result set that is too large means that either (a) the coordinator may give up waiting for the replicas to respond, or (b) the replicas may run out of memory.

Driver paging works much like the CQL LIMIT clause, except that the driver also stores the paging state, effectively a "bookmark" at the end of the last page returned, so it can request the next page of the result set.
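To make the "bookmark" idea concrete, here is a minimal sketch in plain Java. It is a toy model, not driver or Cassandra code: ToyServer, Page and pageSizes are hypothetical names, and the paging state is assumed to be a simple integer offset, whereas the real paging state is an opaque value that the client just hands back to the server.

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {

    /** One page of rows plus a bookmark for the next page (null when there are no more rows). */
    static final class Page {
        final List<String> rows;
        final Integer pagingState; // assumption: an integer offset stands in for the opaque state

        Page(List<String> rows, Integer pagingState) {
            this.rows = rows;
            this.pagingState = pagingState;
        }
    }

    /** Toy "server": holds rows in a fixed order and serves at most pageSize rows per request. */
    static final class ToyServer {
        private final List<String> rows = new ArrayList<>();

        ToyServer(int totalRows) {
            for (int i = 0; i < totalRows; i++) {
                rows.add("row-" + i);
            }
        }

        Page fetch(int pageSize, Integer pagingState) {
            // No bookmark means "start from the beginning".
            int start = (pagingState == null) ? 0 : pagingState;
            int end = Math.min(start + pageSize, rows.size());
            // Only hand back a bookmark if rows remain after this page.
            Integer next = (end < rows.size()) ? Integer.valueOf(end) : null;
            return new Page(new ArrayList<>(rows.subList(start, end)), next);
        }
    }

    /** Toy "driver": keeps passing the last bookmark back until the server returns none. */
    static List<Integer> pageSizes(int totalRows, int pageSize) {
        ToyServer server = new ToyServer(totalRows);
        List<Integer> sizes = new ArrayList<>();
        Integer bookmark = null;
        do {
            Page page = server.fetch(pageSize, bookmark);
            sizes.add(page.rows.size());
            bookmark = page.pagingState;
        } while (bookmark != null);
        return sizes;
    }

    public static void main(String[] args) {
        // 2500 rows fetched 1000 at a time
        System.out.println(pageSizes(2500, 1000));
    }
}
```

The point of the sketch is that each request is bounded by the page size, and the only thing carried between requests is the bookmark. In the 3.x Java driver, which the SimpleStatement in the question suggests, the page size is set per statement with stmt.setFetchSize(1000), and the driver fetches further pages as you iterate over the ResultSet.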

As a side note, an unbounded read like the one in your query is not recommended because it causes a full table scan: every node in the cluster has to be queried to retrieve all the partitions in the table. Unbounded reads will either (c) time out or (d) overload the nodes.

For more info, see Paging in the Java driver. Cheers!
