Bringing together the Apache Cassandra experts from the community and DataStax.


question

bharat.asnani_190772 asked

What is the maximum array size allowed in a within() clause?

For example:

g.V().hasLabel("Acc").has("property", within("a","b"))

dsegraph

1 Answer

bettina.swynnerton answered

Hi,

I am assuming this is for DSE Graph 6.8.

The query is executed differently depending on your data model, specifically on whether the property in the within() clause is a partition key or not.


If it is a partition key, this translates to a query like so:

                 
  SELECT * FROM friends.person WHERE person_id IN ('person1', ..., 'person1000')

which is a multi-partition query. This type of query hits a single coordinator, which then has to keep all queries and responses in its heap. That leads to slower read performance and can cause the coordinator to fail. It is a Cassandra anti-pattern and should be avoided where possible.

In your particular case, the number of parameters that you can pass without a significant performance impact will depend on your resources and concurrent workload. In general, it is better not to use this query pattern extensively.
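One way to keep each request small is to split a long list of keys into fixed-size batches on the client and run one traversal per batch. The following is only a minimal Python sketch of that idea: the helper name `chunked` and the batch size of 100 are illustrative assumptions, not DSE recommendations, and a suitable size has to be found by testing under your own load.

```python
from typing import Iterator, List, Sequence


def chunked(values: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `values`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(values), batch_size):
        yield list(values[start:start + batch_size])


# Hypothetical key list, mirroring 'person1' ... 'person1000' from the query above.
person_ids = [f"person{i}" for i in range(1, 1001)]

# Assumed batch size; tune it against your own cluster and workload.
batches = list(chunked(person_ids, 100))
```

Each batch can then be passed to its own within() step, so that no single request carries the full list. Batching only bounds the size of each request; every batch is still a multi-partition query.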


If the property is not a partition key:

The traversal will not be executed unless you either add the allow-filtering option (which is a performance anti-pattern) or create a search index to satisfy the query.

With a search index, the query translates to a Solr query of this type:

                 
  SELECT * FROM friends.person WHERE solr_query = '{"q":"*:*", "fq":["name:(name1 OR ... OR name1000)"]}' LIMIT 2147483647

Here the number of parameters that you can pass to Solr with this search query is limited by the maxBooleanClauses setting. It defaults to 1024.

You can increase this setting for your search cores, but it is not advisable. Solr imposes this limit to avoid an overuse of this method. If you need to construct your queries in this way, I would again recommend testing under load to see how your queries perform.
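To make the clause count concrete, here is a small Python sketch that builds a filter query of the shape shown above and refuses to build one whose OR list exceeds a limit. The function name `build_solr_query` and the constant `MAX_BOOLEAN_CLAUSES` are hypothetical. The value 1024 is the default mentioned above, the sketch assumes that each OR'ed value counts as one clause, and it does not escape special characters in the values.

```python
import json
from typing import Sequence

# Assumed to mirror the maxBooleanClauses default; adjust if your core differs.
MAX_BOOLEAN_CLAUSES = 1024


def build_solr_query(field: str, values: Sequence[str],
                     limit: int = MAX_BOOLEAN_CLAUSES) -> str:
    """Return a solr_query JSON string that ORs `values` on `field`.

    Raises ValueError if the list is empty or longer than `limit`.
    """
    if not values:
        raise ValueError("at least one value is required")
    if len(values) > limit:
        raise ValueError(
            f"{len(values)} values exceed the clause limit of {limit}"
        )
    clause = " OR ".join(values)
    return json.dumps({"q": "*:*", "fq": [f"{field}:({clause})"]})


# Hypothetical values, following the name1 ... nameN pattern above.
names = [f"name{i}" for i in range(1, 4)]
solr_query = build_solr_query("name", names)
```

Checking the list length on the client in this way gives a clear error before the query is sent, instead of a failure inside the search request.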

I am not aware of any additional limits in the Gremlin traversal API, but let me check for you.

2 comments

Hi,


Thanks for your response.

The property that we are using inside within() is neither a partition key nor covered by a search index; it is a clustering key.


Can you please provide information on how this behaves when we use a clustering key?



Hi,

In the case of a clustering key property, it behaves like any other non-partition-key property: you would have to index the property, and the number of parameters is limited by the maxBooleanClauses setting.
