thaison asked:

What is the optimal way of fetching data that traverses 10K+ edges and vertices?


I would like to traverse all outgoing edges and collect data along the path (from both the edges and the vertices). This works fine for a small number of edges, but it times out when the edge count is high (around 10K).

The traversal is something like:

g.V(id)              // single source vertex
    .outE("uses")    // 10K edges

Does anyone know of a faster traversal that returns the same result, or have any other suggestions?


1 Answer

jeromatron answered:

Without changing the traversal, you should be able to stream the results, which would at least avoid the timeout. With DSE Graph you can stream results through the Apache TinkerPop drivers connected to Gremlin Server, and with the 6.8 core graph engine you can stream them using the DataStax-specific driver functionality. Is that what you're looking for?
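
To make the idea of streaming concrete, here is a minimal sketch in plain Java. It does not use the DataStax or TinkerPop driver APIs: `PagedIterator`, `StreamingSketch`, `fetchPage`, and the page size of 1,000 are all hypothetical stand-ins, assuming only that a streamed result set hands rows back page by page and requests the next page when the caller asks for more rows.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

/** Hypothetical stand-in for a paged result set; not part of any driver API. */
class PagedIterator<T> implements Iterator<T> {
    // Maps a page index to its rows; an empty list means "no more pages".
    private final IntFunction<List<T>> pageFetcher;
    private Iterator<T> current = Collections.emptyIterator();
    private int nextPage = 0;
    private boolean exhausted = false;

    PagedIterator(IntFunction<List<T>> pageFetcher) {
        this.pageFetcher = pageFetcher;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one is used up.
        while (!current.hasNext() && !exhausted) {
            List<T> page = pageFetcher.apply(nextPage++);
            if (page.isEmpty()) {
                exhausted = true;
            } else {
                current = page.iterator();
            }
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}

class StreamingSketch {
    static final int TOTAL_EDGES = 10_000; // assumed edge count from the question
    static final int PAGE_SIZE = 1_000;    // assumed page size
    static int pagesFetched = 0;

    /** Simulates the server returning one page of edge ids. */
    static List<Integer> fetchPage(int pageIndex) {
        pagesFetched++;
        List<Integer> page = new ArrayList<>();
        int start = pageIndex * PAGE_SIZE;
        for (int i = start; i < Math.min(start + PAGE_SIZE, TOTAL_EDGES); i++) {
            page.add(i);
        }
        return page;
    }

    public static void main(String[] args) {
        Iterator<Integer> edges = new PagedIterator<>(StreamingSketch::fetchPage);
        int seen = 0;
        // Stop early: pages beyond the ones already needed are never requested.
        while (edges.hasNext() && seen < 2_500) {
            edges.next();
            seen++;
        }
        System.out.println("rows consumed: " + seen + ", pages fetched: " + pagesFetched);
        // prints "rows consumed: 2500, pages fetched: 3"
    }
}
```

The point of the sketch is that no single request has to return all 10K edges at once, so each round trip stays small. Whether a real driver behaves this way depends on its paging configuration.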

1 comment

Thanks @jeromatron. This is how we execute the traversal and read the results. Is this what you mean by streaming the results, or is there another way?

statement = FluentGraphStatement.newInstance(traversal);
CqlSession cqlSession = dseSession.getSession();
GraphResultSet resultSet = cqlSession.execute(statement);
if (resultSet.iterator().hasNext()) {
    resultSet.forEach((node) -> {
        String label = node.getByKey(T.label) != null ? node.getByKey(T.label).asString() : null;
        // add the result node
        results.accept(new ResultNode(label, node, true, resultSet.iterator().hasNext()));
    });
}
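
The snippet above calls `resultSet.iterator()` in more than one place. Whether repeated `iterator()` calls on an `Iterable` share state depends on the implementation, so one way to avoid relying on that is to obtain the iterator once and derive the "more results follow" flag from it. The following is only a sketch over a plain `Iterable`, not the driver's `GraphResultSet`; `SingleIteratorSketch` and `tagWithHasMore` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SingleIteratorSketch {
    /** Pairs each element with a flag telling whether more elements follow. */
    static <T> List<String> tagWithHasMore(Iterable<T> source) {
        List<String> out = new ArrayList<>();
        Iterator<T> it = source.iterator(); // obtained exactly once
        while (it.hasNext()) {
            T item = it.next();
            boolean hasMore = it.hasNext(); // checked after consuming the item
            out.add(item + ":" + hasMore);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tagWithHasMore(Arrays.asList("a", "b", "c")));
        // prints [a:true, b:true, c:false]
    }
}
```

With a single iterator, the flag for the last element is reliably `false`, and the loop never depends on how a second `iterator()` call behaves.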