question

bharat.asnani_190772 asked ·

Is there a limit to adding multiple vertices and edges in a single graph traversal?

Hi,

I am using DSE Graph 6.8 and adding multiple vertices and edges in a single graph traversal. For example:

g.addV("label").property("key", "value").as("a")
    .addV("label").property("key", "value").as("b")
    // ... more vertices
    .addE("label").from("a").to("b").property("key", "value")
    // ... more edges

Is there any limit on the number of vertices and edges that can be added in one traversal?

Any help is appreciated.

dsegraphgremlin

1 Answer

bettina.swynnerton answered ·

Hi,

I am updating my original answer based on the various comments.

A graph traversal is composed of an ordered list of steps, separated by dots (.).

In DSE Graph 6.8 there is a hard limit on the "length" of a traversal, i.e., the number of steps it contains. The limit is 90 steps and it is not configurable.

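If you were to string several addV() steps together as in your example, you would already run into this limit with 31 vertices: each vertex contributes three steps (addV(), property() and as()), so 31 vertices add up to 93 steps.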

I am pasting the resulting error below:

gremlin> g.addV('person').property('person_id', 'person0').as('0').addV('person').property('person_id', 'person1').as('1').
.
.
.
.addV('person').property('person_id', 'person29').as('29').addV('person').property('person_id', 'person30').as('30')
The submitted traversal exceeded the maximum length allowed of '90'. Please split it into multiple smaller traversals.
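
To make the arithmetic behind that error explicit, here is a small sketch in plain Python (not Gremlin). It assumes that in script mode every dot-separated step counts once, so each addV().property().as() group costs three steps. The function names are made up for illustration, and the real count on your cluster may differ if your traversal uses more properties per vertex.

```python
# Hard limit quoted in the DSE Graph 6.8 error message above.
MAX_TRAVERSAL_LENGTH = 90

# Assumption: in script mode each dot-separated step counts once, so one
# addV(...).property(...).as(...) group costs three steps.
STEPS_PER_VERTEX = 3


def script_step_count(num_vertices, steps_per_vertex=STEPS_PER_VERTEX):
    """Estimated traversal length for a chain of identical vertex inserts."""
    return num_vertices * steps_per_vertex


def max_vertices(limit=MAX_TRAVERSAL_LENGTH, steps_per_vertex=STEPS_PER_VERTEX):
    """Largest number of vertices whose estimated length stays within the limit."""
    return limit // steps_per_vertex


# 30 vertices -> 90 steps (fits), 31 vertices -> 93 steps (exceeds the limit).
print(script_step_count(30), script_step_count(31), max_vertices())
```

With one property per vertex this gives 30 vertices as the most that fit, which matches the failure at 31 vertices shown above.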

You will still run into this limit when using the Fluent API, albeit a bit later. When a traversal is sent through the Fluent API, it is first converted to bytecode, and the steps then appear to be counted differently than with the script API (or the Gremlin console). Step modulators such as property() and as() do not seem to be counted as steps in this case. However, the addV() and addE() steps do contribute to the count, so you will still hit the maximum traversal length of 90 when updating close to 90 elements.

The steps counted from the bytecode traversal seem to include a couple of extra steps, so we don't get quite to the full 90 elements. I have tested this and managed to update 87 vertices before hitting the limit.

I would advise that you test where exactly the limit lies for your specific update query, but it will almost certainly be below 90 updates.

For some additional information about the fluent API, see this blog.


The following approach could be used for updating more graph elements with one Gremlin traversal.

This is heavily based on the examples from this page: https://tinkerpop.apache.org/docs/current/recipes/#long-traversals

Create your data first (here for 100 vertices):

persons = (1..100).collect {["person_id": "person_${it}", "name": "name_${it}", "age": it + 20]}

Inject the data into the traversal and process them as a side effect. This now inserts 100 vertices.

g.inject(persons).sideEffect(
   unfold().
   addV("person").
     property("person_id", select("person_id")).
     property("name", select("name")).
     property("age", select("age"))
).iterate()

The number of operations per traversal is limited, but configurable, as the following error shows:

gremlin> g.inject(persons).sideEffect(unfold().addV("person").property("person_id", select("person_id")).property("name", select("name")).property("age", select("age"))).iterate()
Maximum number of operations (10000) exceeded.
Possible solutions to this are:
 - break your traversal into chunks
 - up the maximum number of mutations by adding g.with("max-mutations", N) at the beginning of your traversal

However, packing these updates into a single traversal does not guarantee all-or-nothing updates. When I scaled the above example up to 10000 inserts, it failed due to timeouts, which resulted in dropped mutations and therefore a partial update. Breaking the inserts into several traversals seems to be the better option in this case.
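
As a rough illustration of "breaking the inserts into several traversals", the sketch below splits the person list into fixed-size batches on the client side. It is plain Python and does not talk to the graph. The `chunked` helper and the batch size of 25 are assumptions chosen for illustration, not values recommended by DataStax. Each batch would then be passed to its own g.inject(batch).sideEffect(...) traversal like the one shown earlier.

```python
def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Same shape of data as the Groovy example above (100 persons).
persons = [
    {"person_id": f"person_{i}", "name": f"name_{i}", "age": i + 20}
    for i in range(1, 101)
]

# Hypothetical batch size; tune it against your own timeouts.
BATCH_SIZE = 25
batches = chunked(persons, BATCH_SIZE)

# Each element of `batches` would be sent as one separate traversal.
print(len(batches), [len(batch) for batch in batches])
```

Note that smaller batches only reduce the risk of timeouts. They do not make the overall update atomic, so the client still has to handle a batch that fails partway through.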

9 comments

If I add the vertices one by one, that means n calls to the server for n vertices. Also, each segment separated by a dot (.) is a step, right?

Is there a better approach to adding multiple vertices and edges in a single server call?


Sorry, yes, a graph traversal is composed of an ordered list of steps, separated by dots (.).

Do you want to add them in a single traversal to keep the mutations within one transaction?

You could try something like the following example:

Create your data first (here for 100 vertices):

persons = (1..100).collect {["person_id": "person_${it}", "name": "name_${it}", "age": it + 20]}

Inject the data into the traversal and process them as a side effect. This now inserts 100 vertices.

g.inject(persons).sideEffect(
   unfold().
   addV("person").
     property("person_id", select("person_id")).
     property("name", select("name")).
     property("age", select("age"))
).iterate()

See here for more examples, including for adding edges: https://tinkerpop.apache.org/docs/current/recipes/#long-traversals

bettina.swynnerton ♦♦ ·

I realise that this leaves your initial question regarding the limit for mutations unanswered.

Another test with a larger list revealed the following limit:

gremlin> g.inject(persons).sideEffect(unfold().addV("person").property("person_id", select("person_id")).property("name", select("name")).property("age", select("age"))).iterate()
Maximum number of operations (10000) exceeded.
Possible solutions to this are:
 - break your traversal into chunks
 - up the maximum number of mutations by adding g.with("max-mutations", N) at the beginning of your traversal

I'll research the maximum number of mutations more and will update this post.


Hi,

The lack of transactions beyond 100 is a blocker for us in using the graph DB.

In our use case, we have a vertex with possibly up to 10k edges that should be updated in a single transaction.

Any other alternatives?


Hi,

Understood. I will look into best practices for this case and let you know what I find.

baid_manish_187433, replying to bettina.swynnerton ♦♦ ·

Thanks Bettina. Do let us know the recommendations.
