Suppose I write JSON documents to Astra using the document API. I want to read the data using CQL afterwards. Want to know if this is possible, and what in what format is the data stored.
Consider these two points:
1. Document API has a new concept of namespaces for holding collections. This is on top of Cassandra's keyspaces. How are namespaces implemented?
2. Is Document API schema-less where arbitrary documents with different schemas can be written? If yes, then how is the data stored? For example I want to write two documents:
{ "name": "jack" }
and
{ "name": [ "first": "john", "last": "doe" ] }
[UPDATE] I wrote the documents from above into a collection "my_collection". Then queried them using CQL:
SELECT * FROM gg.my_collection;
This is how the data is stored (skipped p1 to p44 columns for they are empty):
index | key | p0 | bool_value | dbl_value | leaf | text_value |
---|---|---|---|---|---|---|
0 |
27882e52-76a4-4a06-90e7-ec81d7c3443e |
data |
data |
{"name": "jack"} |
||
1 |
9ec92cd1-046e-4a2b-9c97-1392104b30a0 |
data |
data |
{"name": "jack", "city": "mumbai"} |
||
2 |
dbc63e19-a13e-4688-8be1-b7ac0c66a336 |
data |
data |
{ "name": { "first": "john", "last": "doe" } } |
Also tested this the other way round, creating tables using CQL and then fetching data using document API.
1. The table names are returned by the get collections API.
2. Get documents API does not work because column schemas don't match.
As per my understanding
1. The document UID is stored as `key`.
2. The unstructured JSON has been flattened into `text_value` column.
3. Namespace has been mapped to keyspace and collection name to table name.