What is the use of replication factor of 3 when running Cassandra on my laptop?

I have downloaded Apache Cassandra onto my local computer and started running it from the Windows Command Prompt.

Without any network connection, I started Cassandra from CMD and created a keyspace, specifying the replication strategy as "NetworkTopologyStrategy" and RF = 3.

What does that mean? Will my data be replicated three times on my local computer, or distributed?

1 Answer

Hi @chandrasekar.b03_190734,

It sounds like you only have one node in your cluster.

You can confirm the topology of your cluster by running nodetool status on your Cassandra node.

For a single node cluster, the concept of replication does not make much sense. This node will have all the data, and there will only be one copy of the data. If the node is down, no node will service any requests.

Increasing the replication factor beyond the number of nodes will not result in more replicas being stored. It will, however, cause Unavailable errors when you read or write at a consistency level that requires more replicas than you have nodes.

Here is an example:

On a single node cluster, I created a keyspace with replication factor 42:

CREATE KEYSPACE huge WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '42'}  AND durable_writes = true;

I added a simple table:

CREATE TABLE huge.table1 (
    id text PRIMARY KEY,
    data text
);

and I successfully inserted some data:

cqlsh> insert into huge.table1 (id, data) values ('id1', 'data1');
cqlsh> select * from huge.table1 ;

 id  | data
-----+-------
 id1 | data1

(1 rows)

This works because cqlsh uses a consistency level of ONE by default, so a single replica is enough to acknowledge the request.

Now, if I try to write with consistency ALL:

cqlsh> consistency all
Consistency level set to ALL.

cqlsh> insert into huge.table1 (id, data) values ('id2', 'data2');
NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: MyDataCenter>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level ALL" info={\'required_replicas\': 42, \'alive_replicas\': 1, \'consistency\': \'ALL\'}',)})

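By allowing a replication factor greater than the number of nodes, you create a scenario where every consistency level that needs more than one replica (TWO, QUORUM, ALL and so on) fails. The same applies to your keyspace with RF = 3 on a single node: QUORUM needs 2 replicas and ALL needs 3, but only 1 is alive.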
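
If it helps, the arithmetic behind those failures can be sketched in a few lines of Python. This is only an illustration assuming a single datacenter; the function names required_replicas and is_achievable are made up for this example and are not part of Cassandra or any driver.

```python
# Sketch only: how many replicas must respond at a given consistency level,
# assuming a single datacenter. Function names are hypothetical.

def required_replicas(consistency: str, rf: int) -> int:
    """Replicas that must respond for a request at this consistency level."""
    quorum = rf // 2 + 1  # a quorum is a majority of the replication factor
    table = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum,
        "ALL": rf,
    }
    return table[consistency]


def is_achievable(consistency: str, rf: int, live_nodes: int) -> bool:
    """True if enough replicas can be alive to satisfy the level."""
    # A node stores at most one replica of a row, so the number of
    # live replicas can never exceed the number of live nodes.
    alive_replicas = min(rf, live_nodes)
    return alive_replicas >= required_replicas(consistency, rf)


if __name__ == "__main__":
    # Single-node cluster, with the RF from the question and from the example.
    for rf in (3, 42):
        for cl in ("ONE", "TWO", "QUORUM", "ALL"):
            print(rf, cl, required_replicas(cl, rf),
                  is_achievable(cl, rf, live_nodes=1))
```

The min(rf, live_nodes) line is the key point: it matches the 'alive_replicas': 1 versus 'required_replicas': 42 pair reported in the error above.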

I hope this helps.

