JoshPerryman asked (edited by Erick Ramirez)

Set GraphTraversalSource OLAP settings with Java API?

In my DSE Graph (6.0.9) integration tests I'm using the Java Driver (1.8) to set up both OLTP (g) and OLAP (a) GraphTraversalSources. I use the fluent API with both throughout the test class.


The problem I'm running into is that the default settings for the OLAP GTS (a) have it taking up all of my local cores and about half of my RAM, when it should be fine with 1-2 cores and a couple of GB. What's worse, the resulting Spark application ("Apache TinkerPop's Spark-Gremlin") ends up being long-running, and I have to go into the Spark Master console to kill it if I want to do anything else with my local Spark.


Is there any way to configure the recommended Spark properties (https://docs.datastax.com/en/dse/6.0/dse-dev/datastax_enterprise/graph/graphAnalytics/graphAnalyticsSparkGraphComputer.html) when setting up my remote traversal sources with the Java API? So far all I can find is:


a = DseGraph.traversal(dseSession, 
       new GraphOptions().
           setGraphName(dseGraphName).
           setGraphSource("a"));


Or am I going to have to switch to the 2.x Java driver and manage these settings in the config with profiles?

Tags: java driver, graph
6 comments

jlacefield commented:

I don't think the Java 2.0 driver supports this either right now. Just created an internal JIRA for this.

JoshPerryman replied to jlacefield:

Currently, my only use case is aggregations in the integration tests (e.g. a.V().count()). As a workaround I can set allow_scan = true and do the same thing with the transactional GraphTraversalSource, but I'd prefer to use Gremlin OLAP.

jlacefield commented:

Hello... good feedback. Will get this back to our devs to see what we can do.

JoshPerryman replied to jlacefield:

Thanks Jonathan. I wasn't planning to jump to the 2.x driver for a couple of months, but I've been getting us ready to go in that direction and if that's the only option, we can accelerate that plan.

Aleks Volochnev commented:

Am I right that you want to do the same thing with the Java API as with Gremlin in the picture?

GraphOptions is not the way; it doesn't support this kind of option.

1565686334233.png (54.9 KiB)
JoshPerryman replied to Aleks Volochnev:

Yes, that's the functionality which I need. And yes, I've discovered that there isn't a way to do that with the public GraphOptions API. So the question remains: how can I do it with the Java API using the 1.8 driver? Is there any other option aside from upgrading the driver?


1 Answer

jlacefield answered:

Josh, after working with our devs, it looks like we're actually changing the way global Spark configuration is handled in the new version of Graph. The change will deprecate user-provided settings in favor of a property file that controls global Spark settings, such as resource allocation. The purpose is to reduce user confusion, since the settings exposed through the Spark config API are really global and affect all users of the Spark Gremlin OLAP service. Also, today, Spark has to be restarted after a property is changed for the change to take effect. This is not well known and causes confusion for our end users.


This is a long way of saying that we're going to deprecate making global SparkContext settings through any graph APIs in favor of a new property file and dsetool approach that will be available soon in Labs.

1 comment

JoshPerryman commented:

Thanks for the update, Jonathan. I'll switch from Gremlin OLAP to setting allow_scan = true, which is fine for now. Sounds like we've got another reason to be looking forward to the next release.
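
For readers landing here, the allow_scan workaround discussed in this thread might be sketched roughly as follows. This is only an illustration, not something posted by the participants: the class AllowScanScript is a hypothetical helper, and it assumes the DSE Graph 6.0 schema option is named graph.allow_scan and takes a string value. Check the option name against the DSE Graph documentation for your version.

```java
/**
 * Hypothetical helper (not part of the DSE driver): builds the Gremlin schema
 * script that turns full-graph scans on or off for a DSE Graph 6.0 graph.
 * Assumption: the schema option is named 'graph.allow_scan'.
 */
final class AllowScanScript {
    private AllowScanScript() {}

    /** Returns the schema script that sets graph.allow_scan to the given value. */
    static String build(boolean allow) {
        return "schema.config().option('graph.allow_scan').set('" + allow + "')";
    }

    public static void main(String[] args) {
        // Print the script so it can be inspected or pasted into the Gremlin console.
        System.out.println(build(true));
    }
}
```

With the 1.8 driver, the resulting string would presumably be run once during test setup as a plain graph statement, for example dseSession.executeGraph(new SimpleGraphStatement(AllowScanScript.build(true)).setGraphName(dseGraphName)), after which aggregations such as g.V().count() can go through the OLTP traversal source instead of the OLAP one. Full scans are expensive on large graphs, so this is likely only reasonable for small integration-test datasets.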
