Bringing together the Apache Cassandra experts from the community and DataStax.


Yeikel.ValdesSantana_186477 asked ·

Can we tokenize the partition key for Graph vertices?

I am trying to build a search index for my graph using the following syntax:

schema.vertexLabel("address").index("search").search().by("line_1").asText().by("full_address").asText().add()

Here, full_address is the partition key for my address vertex.

The statement above completes successfully and reports no errors, but it does not seem to create a tokenized field for full_address. This is the generated Solr schema:

<schema name="latest_schema.address_p" version="1.5">
  <types>
    <fieldType class="org.apache.solr.schema.TrieIntField" name="TrieIntField"/>
    <fieldType class="org.apache.solr.schema.UUIDField" name="UUIDField"/>
    <fieldType class="org.apache.solr.schema.TextField" name="TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
  </types>
  <fields>
    <field docValues="true" indexed="true" multiValued="true" name="~~property_key_id" type="TrieIntField"/>
    <field docValues="true" indexed="true" multiValued="true" name="~~property_id" type="UUIDField"/>
    <field indexed="true" multiValued="false" name="line_1" type="TextField"/>
    <field docValues="true" indexed="true" multiValued="false" name="full_address" type="StrField"/>
  </fields>
  <uniqueKey>(full_address)</uniqueKey>
</schema>

In this schema, full_address is indexed as a StrField (non-tokenized) instead of the expected TextField (tokenized), even though the index definition requested asText().

I also tried omitting the type for full_address, but it did not make any difference:

schema.vertexLabel("address").index("search").search().by("full_address").by("line_1").asText().add()

Could you please clarify why this is the case? The documentation [1] does not say that a partition key cannot be tokenized, so I am not sure why this is happening.

[1] https://docs.datastax.com/en/dse/6.0/dse-dev/datastax_enterprise/graph/using/useSearchIndexes.html

dsegraph

1 Answer

Lewisr650 answered ·

I would break this down into separate stages (schema creation, index management, and data loading) for ease of coding, maintenance, and incremental development. That gives you finer-grained control over each development block. This guide covers the stages of graph schema development: https://docs.datastax.com/en/dse/6.8/dse-dev/datastax_enterprise/graph/using/manageSchemaTOC.html


Hi @Lewisr650

I appreciate your response, but I am not sure how it answers my specific question.

I tried to tokenize a partition key in my graph, but it does not seem to be possible with DSE, even though the index creation does not report any errors.

Lewisr650, replying to Yeikel.ValdesSantana_186477:

DSE gives you the ability to work in a multi-model mode. DSE has Lucene integration via Solr. You can modify the Solr schema to achieve your results. Breaking it down into stages will give you greater control rather than applying default settings with the approach you are taking.


I understand that I can modify the Solr schema, but I also need it to stay in sync with my graph.

I am considering the "copy to" functionality as a workaround if the partition key cannot be tokenized. I am still wondering why that is the case, given that the index creation neither fails nor gives any warning.
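For reference, the copy-field workaround could look roughly like the fragment below. This is only a sketch under assumptions, not a confirmed DSE Graph procedure: the field name full_address_text is hypothetical, the fragment reuses the TextField type from the generated schema in the question, and copyField is the standard Solr schema element. Whether DSE Graph keeps such a hand-edited schema in sync with the graph schema, and whether DSE Search places extra constraints on copy-field destinations, would need to be checked against the DSE documentation.

```xml
<!-- Sketch only: possible additions to the generated schema from the question. -->
<!-- "full_address_text" is a hypothetical name, not part of the original schema. -->
<fields>
  <!-- Existing partition-key field, left as a non-tokenized StrField. -->
  <field docValues="true" indexed="true" multiValued="false" name="full_address" type="StrField"/>
  <!-- Hypothetical tokenized field using the TextField type already defined. -->
  <field indexed="true" multiValued="false" name="full_address_text" type="TextField"/>
</fields>
<!-- Standard Solr copyField: also index the full_address value into the tokenized field. -->
<copyField source="full_address" dest="full_address_text"/>
```

With something like this in place, text searches would target full_address_text while full_address remains the unique key.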
