I am installing a small DSE 6.8 cluster with the following topology:
- Datacenter "dc1": 3 Cassandra nodes
- Datacenter "dc2": 2 Analytics Solo nodes
What replication factor for Analytics keyspaces should I use?
The documentation page "Setting the replication factor for analytics keyspaces" (1) contains this note:
CAUTION: Only replicate DSE Analytics keyspaces to other DSE Analytics datacenters. DSEFS does not support replication to other datacenters, and the dsefs keyspace only contains metadata, not the data stored in DSEFS. Each DSE Analytics datacenter should have its own DSEFS instance.
From this note I understood that analytics keyspaces should be replicated only to datacenters with an Analytics workload, and that replicating them to datacenters with other workloads (Transactional, Graph, Search) is wrong and should be avoided. Is that right?
Based on the above, my replication settings should be:
ALTER KEYSPACE analytic_keyspace_name WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'dc2': '2' };
However, the page "Creating a DSE Analytics Solo datacenter" has an example, "Creating a DSE Analytics Solo datacenter within an existing DSE cluster" (2), which uses a similar topology:
- Datacenter "DC1" - has existing database data
- Datacenter "DC2" - does not store any data but will perform analytics jobs using the database data from DC1
So, in this example "DC1" has a Transactional workload and "DC2" has an Analytics workload. Am I wrong here?
This example says to configure analytics keyspaces to replicate to both datacenters:
ALTER KEYSPACE dse_leases WITH REPLICATION = { 'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3 };
This confuses me, because it goes against the caution note above ("Only replicate DSE Analytics keyspaces to other DSE Analytics datacenters").
Can you explain this case?
Should I replicate my Analytics Solo keyspaces to my "dc1", which has only a Transactional workload?
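For reference, here is a minimal sketch of how the current replication settings could be inspected before altering anything. It assumes the standard system_schema.keyspaces table and lists only the two keyspace names that appear in the documentation quoted above (dse_leases and dsefs). Other analytics keyspaces may exist depending on the installation.

```cql
-- Show the replication map currently configured for the analytics keyspaces
-- mentioned above; extend the IN list with any others present in the cluster.
SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name IN ('dse_leases', 'dsefs');
```

The replication column shows the strategy class and the per-datacenter replica counts, which makes it easy to see whether "dc1" is currently included.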
Links: