Bringing together the Apache Cassandra experts from the community and DataStax.

ortizfabio_185816 asked Aleks Volochnev commented

IllegalArgumentException: Mutation of 112.881MiB is too large for the maximum size of 64.000MiB

I am getting warning messages in the server logs as shown in the gist file below. Is it something I should be concerned about, since it is just a warning?

EDIT: Yes, the mutations are large because I increased commit_log_segment_size to 124MB so I can write mutations up to 64MB. I am trying to write a large number of rows from Spark as fast as possible, which seems a daunting task in Cassandra. I should probably open a new thread and ask that question there. I am sending 100 rows of up to 2KB each, with 3 concurrent writes, at a speed of 0.01MB/sec, using 100 partitions.

If I reduce the speed it does not error out, but it takes hours. My cluster has 8 brokers, mirrored across two datacenters. The table is very simple; it has the following columns:

id bigint,
type_code text,
ver_nb bigint,
detail_json text,
cre_ts timestamp,
cre_user text,
last_upd timestamp,
PRIMARY KEY (id, type_code, ver_nb)

UPDATE (Feb 24 19:00 UTC):

I finally got the job to finish without overriding the buffer, triggering the oversized mutation warning, or taking forever. Here are the stats:

I am inserting 13 million rows with a total size of 13GB on a cluster with 8 brokers. Inserting this with the spark-cassandra-connector using 100 partitions took about 2 hours.

My Spark connector configuration is:

spark.cassandra.output.consistency.level = "LOCAL_ONE"
spark.cassandra.output.concurrent.writes = "5"
spark.cassandra.output.batch.grouping.buffer.size = "10"
spark.cassandra.output.batch.size.rows = "1"
spark.cassandra.output.batch.grouping.key = "partition"
spark.cassandra.output.throughput_mb_per_sec = "0.01"
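A quick back-of-envelope check of these settings (a sketch in Python; all figures are taken from this thread, and the arithmetic assumes the throttle binds on every task for the whole run):

```python
# Back-of-envelope check of the connector settings above. All figures
# come from this thread; treat the result as a rough bound, not a
# prediction of the connector's exact behavior.
ROW_SIZE_KIB = 2            # rows of up to 2KB (from the question)
BATCH_ROWS = 1              # spark.cassandra.output.batch.size.rows
MAX_MUTATION_MIB = 64.0     # the ceiling reported in the log

# With single-row batches, each mutation is tiny -- nowhere near 64MiB:
batch_mib = BATCH_ROWS * ROW_SIZE_KIB / 1024
assert batch_mib < MAX_MUTATION_MIB

# Aggregate throughput if all 100 Spark partitions write concurrently,
# each throttled to 0.01MB/sec:
total_mb = 13 * 1024        # ~13GB of data (from the update)
aggregate_mb_per_sec = 100 * 0.01
hours = total_mb / aggregate_mb_per_sec / 3600
print(round(hours, 1))      # roughly 3.7 hours at the throttled rate
```

The observed ~2 hours does not match this bound exactly, which only shows how rough the estimate is; the point is that with single-row batches each mutation stays far below the 64MiB ceiling, so the throttle, not the mutation size, dominates the runtime.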

I wish I could write a lot faster, but when I increase spark.cassandra.output.throughput_mb_per_sec I get the oversized mutation warning again. If I had not increased commit_log_segment_size, the speed would have to be even lower. A picture of the system can be seen below: there is an initial bump to 55K transactions per minute at the start, then a steady 25K/min.

561-writespeed.png (22.3 KiB)

Erick Ramirez answered ortizfabio_185816 commented

@ortizfabio_185816 It's something you should definitely be concerned about. Those messages mean that the very large writes from your application failed. It doesn't matter if your application retries the write; it will always fail because the mutation is too large.

I've noted that the maximum mutation size reported is 64MB, whereas the default is just 16MB. This implies that you've bumped the commitlog segment size up to 128MB from the default of 32MB, since the maximum mutation size defaults to half the segment size. You need to understand why the application's mutations are so large, since it might be symptomatic of a problem with your access patterns or a bad data model. Increasing the commitlog segment size shouldn't be your first response when you get warnings about large mutations. Cheers!
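The relationship Erick describes can be sketched as follows (assuming the default rule that the maximum mutation size is half the commitlog segment size):

```python
def max_mutation_mib(commitlog_segment_size_mb: float) -> float:
    """Default max mutation size: half the commitlog segment size."""
    return commitlog_segment_size_mb / 2

# Stock defaults: 32MB segments allow 16MiB mutations.
assert max_mutation_mib(32) == 16.0

# A 64MiB ceiling, as reported in the log, implies 128MB segments:
assert max_mutation_mib(128) == 64.0

# The failing write was 112.881MiB -- over the limit either way, so
# retries will keep failing until the write itself is made smaller.
assert 112.881 > max_mutation_mib(128)
```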


That really is a massive mutation. The recommended partition size is under 100MB; avoid more than 100K rows per partition and individual records larger than 5MB. Depending on the data model, you may consider splitting the records. Posting a DESCRIBE TABLE output could help us see whether we can help further.
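Those rules of thumb can be written down as a small checker (a sketch; the thresholds are the guideline figures quoted in this comment, not hard server limits):

```python
# Rule-of-thumb sizing guidelines quoted above (not hard caps).
MAX_PARTITION_MIB = 100.0
MAX_ROWS_PER_PARTITION = 100_000
MAX_RECORD_MIB = 5.0

def within_guidelines(partition_mib: float, rows: int,
                      largest_record_mib: float) -> bool:
    """Check a partition against the sizing rules of thumb."""
    return (partition_mib <= MAX_PARTITION_MIB
            and rows <= MAX_ROWS_PER_PARTITION
            and largest_record_mib <= MAX_RECORD_MIB)

# A single 112.881MiB mutation busts the partition guideline by itself:
assert not within_guidelines(112.881, 1, 112.881)

# Whereas 50K rows of 2KB each stay comfortably inside all three limits:
assert within_guidelines(50_000 * 2 / 1024, 50_000, 2 / 1024)
```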


Agreed. I would really want to know more about the mutation being attempted here and the data model as @Cedrick Lunven requested. This is definitely something that needs to be addressed.

ortizfabio_185816 answered Aleks Volochnev commented

Basically the problem here is that from Spark you define n partitions, and each partition has its own writer to Cassandra. Say the table has a replication factor of 3: for each row sent to a broker there will be two more replica writes. Say there are 100 partitions and I set the speed of each to 1MB/sec. At a certain moment, all the rows being sent from those 100 partitions may correspond to the same broker. In that case the broker might be overwhelmed with writes, and the oversized mutation occurs. There is no way to control how fast the cluster as a whole is written to from a Spark process; the only control is how fast each partition writes to the table.
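The fan-out described above can be sketched like this (not Cassandra's real partitioner — md5 stands in for Murmur3 token ownership, and the ring layout is simplified):

```python
import hashlib
from collections import Counter

# Sketch of replica fan-out: with RF = 3, every row becomes 3 physical
# writes, and rows whose partition keys map to the same node all arrive
# there at once -- so one node can be swamped even though each Spark
# task individually respects its own throttle.
NODES = 8
RF = 3

def replicas(partition_key: str) -> list[int]:
    # md5 stands in for Murmur3; take the primary owner plus the next
    # RF - 1 nodes on the (simplified) ring.
    primary = int(hashlib.md5(partition_key.encode()).hexdigest(), 16) % NODES
    return [(primary + i) % NODES for i in range(RF)]

writes = Counter()
for key in map(str, range(10_000)):
    for node in replicas(key):
        writes[node] += 1

# Each of the 10,000 rows fans out to RF = 3 physical writes:
assert sum(writes.values()) == 10_000 * RF
```

With an evenly distributed key the load spreads across all 8 nodes; a skewed key would concentrate those same 30,000 physical writes on a few of them, which is exactly the hot-spot described in the next paragraph of this answer.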

Having said that, I figured out that my table's partition key is not distributed evenly across all the servers, so one server is being hit almost 10 times more than the others. I will post another question about setting a custom partitioning strategy for that table.


Thank you for the update; it really makes sense. An incorrect data model is usually the key to low performance, exactly as in this example. Could you accept an answer, or should I close the question as resolved?
