Yeikel.ValdesSantana_186477 asked Yeikel.ValdesSantana_186477 commented

Can we tokenize the partition key for Graph vertices?

I am trying to build an index for my graph using the following syntax:


Here, full_address is the partitionKey for my address vertex.

Although the syntax above completes successfully and I did not see any errors, it does not seem to create a tokenized field for full_address:

<schema name="latest_schema.address_p" version="1.5">
    <fieldType class="org.apache.solr.schema.TrieIntField" name="TrieIntField"/>
    <fieldType class="org.apache.solr.schema.UUIDField" name="UUIDField"/>
    <fieldType class="org.apache.solr.schema.TextField" name="TextField">
        <analyzer>
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>
    <fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
    <field docValues="true" indexed="true" multiValued="true" name="~~property_key_id" type="TrieIntField"/>
    <field docValues="true" indexed="true" multiValued="true" name="~~property_id" type="UUIDField"/>
    <field indexed="true" multiValued="false" name="line_1" type="TextField"/>
    <field docValues="true" indexed="true" multiValued="false" name="full_address" type="StrField"/>

As the schema shows, full_address is indexed as a StrField (non-tokenized) instead of the expected TextField (tokenized).

I also tried omitting the type, but it did not make any difference:


Could you please clarify why this is the case? Looking at the documentation[1], I did not see anything stating that this is not possible, so I am not sure why this is happening.



1 Answer

Lewisr650 answered Yeikel.ValdesSantana_186477 commented

I would break this syntax down into separate stages (schema creation, index management and data loading) for ease of coding, maintenance and progressive development. You'll have finer-grained control over your development blocks. This will give you guidance on how to break down the stages of Graph Schema development:


Yeikel.ValdesSantana_186477 commented

Hi @Lewisr650

I appreciate your response, but I am not sure how it answers my specific question.

I tried to tokenize a partition key in my graph, but it does not seem possible to do so with DSE, even though it does not give any errors.

Lewisr650 commented (in reply to Yeikel.ValdesSantana_186477)

DSE gives you the ability to work in a multi-model mode. DSE has Lucene integration via Solr. You can modify the Solr schema to achieve your results. Breaking it down into stages will give you greater control rather than applying default settings with the approach you are taking.

Yeikel.ValdesSantana_186477 commented (in reply to Lewisr650)

I understand I can modify the Solr schema, but I also need it to stay in sync with my graph.

I thought about using the "copy to" functionality if the partition key cannot be tokenized. I am still wondering why this is the case, given that the index creation does not fail or give any warnings.
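
For illustration, the "copy to" idea might look roughly like the fragment below in the Solr schema. It is only a sketch under assumptions: full_address_text is a hypothetical field name, the fragment reuses the TextField type already defined in the schema above, and it is not confirmed here that DSE Graph preserves a manual schema edit like this when the graph schema changes.

```xml
<!-- Hypothetical tokenized copy of the partition key; the name full_address_text is an assumption. -->
<field indexed="true" multiValued="false" name="full_address_text" stored="false" type="TextField"/>

<!-- Standard Solr copyField directive: copies full_address values into the tokenized field at index time. -->
<copyField source="full_address" dest="full_address_text"/>
```

With a fragment like this, full_address itself would remain a StrField, and tokenized searches would have to target the copied field instead.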

0 Likes 0 ·