Data modeling question regarding the use of edge properties versus a new edge to support "Time traversals" (SubgraphStrategy).
In our graph, we have the following schema:
Person(vertex) => has_phone(edge) => Phone(vertex)
Where the has_phone edge has the following properties:
- create_timestamp
- end_timestamp (to denote an "old phone")
- update_timestamp
The reason we implemented timestamps is that we'd like to support queries like "Show me the phone numbers that this person had" and to know when that relationship ended. However, we do not intend to ask questions like "Show me the phone numbers that this person had in 2019".
In general, we will have the following queries.
- Show the phones that this person has at the moment (80-90% of our queries)
- Show me the phones that this person had at some point
- Both of the above together (current and past phones)
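To make these three queries concrete, here is a minimal sketch in plain Python (not Gremlin and not the DSE API) of how they reduce to filters on the edge properties. It assumes that a has_phone edge with no end_timestamp is a current phone; the HasPhone class, the function names, and the sample data are hypothetical and only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HasPhone:
    """One has_phone edge. Assumption: end_timestamp is None while the phone is current."""
    person: str
    phone: str
    create_timestamp: int
    update_timestamp: int
    end_timestamp: Optional[int] = None


def current_phones(edges: List[HasPhone], person: str) -> List[str]:
    # Query 1: phones the person has at the moment (no end_timestamp set).
    return [e.phone for e in edges if e.person == person and e.end_timestamp is None]


def past_phones(edges: List[HasPhone], person: str) -> List[str]:
    # Query 2: phones the person had at some point (end_timestamp set).
    return [e.phone for e in edges if e.person == person and e.end_timestamp is not None]


def all_phones(edges: List[HasPhone], person: str) -> List[str]:
    # Query 3: current and past phones together, so no timestamp filter at all.
    return [e.phone for e in edges if e.person == person]


# Hypothetical sample data.
edges = [
    HasPhone("alice", "555-0100", create_timestamp=1, update_timestamp=1),
    HasPhone("alice", "555-0101", create_timestamp=1, update_timestamp=5, end_timestamp=5),
    HasPhone("bob", "555-0102", create_timestamp=2, update_timestamp=2),
]
```

Note that the most common query (current phones) is the one that needs the property filter, which is why the index question matters for this model.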
With that in mind, our initial idea was to use "Time traversals"[1] using a SubgraphStrategy.
But I was wondering whether that is how this is normally implemented, considering the cardinality of the timestamps, the possibility of having to create an index to support the workload, and the type of queries we are planning (as described, 80-90% will be about the current state of the graph, not the past).
In that sense, I was wondering if this makes more sense:
Person(vertex) => has_phone(edge) => Phone(vertex)
- for all the current phones
Person(vertex) => had_phone(edge) => Phone(vertex)
- for all the "past" phones
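As a rough sketch of what this alternative implies, the plain-Python model below (again not Gremlin or the DSE API) selects phones by edge label instead of by timestamp. It assumes that ending a relationship means replacing the has_phone edge with a had_phone edge that carries the end_timestamp, since as far as we understand an edge label cannot be changed in place. The Edge class, phones, and retire_phone are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Edge:
    """A Person -> Phone edge labelled either "has_phone" or "had_phone"."""
    label: str
    person: str
    phone: str
    create_timestamp: int
    end_timestamp: Optional[int] = None


def phones(edges: List[Edge], person: str,
           labels: Tuple[str, ...] = ("has_phone",)) -> List[str]:
    # Select by label only; the default covers the common "current phones" query.
    return [e.phone for e in edges if e.person == person and e.label in labels]


def retire_phone(edges: List[Edge], person: str, phone: str, now: int) -> List[Edge]:
    # Ending a relationship: drop the has_phone edge and add a had_phone edge
    # with the same endpoints, recording when the relationship ended.
    result = []
    for e in edges:
        if e.label == "has_phone" and e.person == person and e.phone == phone:
            result.append(Edge("had_phone", e.person, e.phone, e.create_timestamp, now))
        else:
            result.append(e)
    return result


# Hypothetical sample data: alice starts with two current phones, then one is retired.
edges = [
    Edge("has_phone", "alice", "555-0100", create_timestamp=1),
    Edge("has_phone", "alice", "555-0101", create_timestamp=2),
]
edges = retire_phone(edges, "alice", "555-0101", now=5)
```

The trade-off in this sketch is that the common read needs no property filter, while every "phone ended" event becomes a delete plus an insert instead of a property update.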
If it helps, we are using DSE Graph 6.0.4.
[1] https://www.datastax.com/blog/2016/09/gremlins-time-machine