


swang_148208 asked

Is there a way of changing commitlog_segment_size_in_mb without restarting Cassandra?

Hi guys, I have run into an issue.
While adding a new node to a Cassandra cluster, I saw the following in debug.log:

java.lang.IllegalArgumentException: Mutation of 85.503MiB is too large for the maximum size of 16.000MiB

So I set commitlog_segment_size_in_mb=256, and the problem was solved.
However, this requires a Cassandra restart, and I had to restore the parameter after the operation was completed, which means restarting Cassandra again.
That's not what I want.
Is there a better way?

Thanks. Regards


Erick Ramirez answered

@swang_148208 Unfortunately, there isn't a way of changing the size of commit log segments without having to restart Cassandra.

As a side note, there is a reason why the maximum mutation size is 16MB. 85 MB mutations are excessively large. Even a 10MB mutation is large and problematic. We recommend you work with your application team to determine why you have really large mutations. It's an indicator that you might need to review your data model as it can result in more problems further down the line. Cheers!
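For context on where the 16MB figure comes from: as far as I know, when max_mutation_size_in_kb is left unset in cassandra.yaml, the maximum mutation size defaults to half of commitlog_segment_size_in_mb. The sketch below only does that arithmetic. The function names are made up for illustration, and it assumes max_mutation_size_in_kb has not been set explicitly.

```python
import math


def max_mutation_mib(commitlog_segment_size_in_mb: int) -> float:
    """Default mutation cap, assuming it is half of the commit log segment size."""
    return commitlog_segment_size_in_mb / 2


def min_segment_size_mb(mutation_mib: float) -> int:
    """Smallest whole-MB segment size whose default cap fits the given mutation."""
    return math.ceil(mutation_mib * 2)


if __name__ == "__main__":
    print(max_mutation_mib(32))         # a 32 MB segment gives the 16 MiB cap from the error
    print(max_mutation_mib(256))        # cap after setting commitlog_segment_size_in_mb=256
    print(min_segment_size_mb(85.503))  # smallest segment size that would accept the 85.503MiB mutation
```

This shows why 256 made the error go away, and also how far above the default the offending mutation is.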


Cedrick Lunven answered, swang_148208 commented

Hello, I would +1 Erick's comment. At the server level, if you cannot insert into the commit log file, there is no way to know which data is concerned.

At the application level there is something wrong, period. Either the data model is bad, or they did not understand what a batch should be used for. In both cases, it is not at the Cassandra level that you must act. They can also enable some metrics recording at the driver level.

If you raise the number, they will start telling you Cassandra is slow.

Now, to do a rolling restart of your cluster with no downtime, you can use cstar.
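To illustrate the application-level point about oversized batches, here is a minimal sketch of grouping writes so that no single group exceeds a byte budget. Everything in it is hypothetical: chunk_by_size is not a driver API, the payloads are plain bytes, and the summed payload length is only a rough proxy for the size of the mutation Cassandra would actually serialize.

```python
from typing import Iterable, Iterator, List


def chunk_by_size(payloads: Iterable[bytes], max_chunk_bytes: int) -> Iterator[List[bytes]]:
    """Greedily group payloads so each group's total size stays within max_chunk_bytes."""
    chunk: List[bytes] = []
    chunk_bytes = 0
    for payload in payloads:
        if len(payload) > max_chunk_bytes:
            # A single oversized value cannot be fixed by regrouping;
            # that points back at the data model.
            raise ValueError("single payload exceeds the chunk budget")
        if chunk and chunk_bytes + len(payload) > max_chunk_bytes:
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(payload)
        chunk_bytes += len(payload)
    if chunk:
        yield chunk


if __name__ == "__main__":
    rows = [b"x" * 400, b"y" * 700, b"z" * 300]
    for group in chunk_by_size(rows, max_chunk_bytes=1000):
        # Each group would be sent as its own, smaller write.
        print(len(group), sum(len(p) for p in group))
```

The budget would need to sit well below the mutation limit, since the serialized form carries overhead this sketch does not count.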

3 comments

Hi @Cedrick Lunven and @Erick Ramirez,

I finally found a way to add a new node without modifying the commitlog_segment_size_in_mb parameter.

When the error

java.lang.IllegalArgumentException: Mutation of <X> MiB is too large for the maximum size of 16.000MiB

occurs, execute the command:

nodetool netstats

Check whether any data migration is still in progress from the existing nodes to the newly added node.

If the data migration has stopped, restart the newly added node. Once it has restarted, the data migration starts again. Repeat this as many times as needed: each restart migrates another part of the data to the newly added node, so the data gradually becomes correctly distributed across the nodes.

This method is certainly not ideal, but it does solve the problem we encountered, that is, adding a new node without restarting the existing nodes.
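The "has the migration stopped?" check could be sketched roughly as below, by comparing two captures of the nodetool netstats output taken some minutes apart. This rests on assumptions about the output format, which differs between Cassandra versions: that it contains a "Mode:" line, and that per-peer progress lines start with "Receiving" or "Sending". The helper names are hypothetical, and capturing the output and restarting the node are left out.

```python
import re
from typing import List, Optional


def node_mode(netstats_output: str) -> Optional[str]:
    """Return the value of the 'Mode:' line (assumed to exist), or None if absent."""
    match = re.search(r"^Mode:\s*(\S+)", netstats_output, re.MULTILINE)
    return match.group(1) if match else None


def progress_lines(netstats_output: str) -> List[str]:
    """Collect lines assumed to report streaming progress for each peer."""
    return [
        line.strip()
        for line in netstats_output.splitlines()
        if line.strip().startswith(("Receiving", "Sending"))
    ]


def migration_stalled(previous: str, current: str) -> bool:
    """True if the node is still joining but progress did not change between captures."""
    if node_mode(current) != "JOINING":
        return False
    return progress_lines(previous) == progress_lines(current)
```

A node that reports JOINING with no progress lines at all in both captures is also treated as stalled, which matches the "migration has stopped" case described above.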


I encountered this error:

Mutation of <X> MiB is too large for the maximum size of 16.000MiB

while adding a new node, not from our application, so I suspect that the process of adding a node works like this:

1) When a new node is added, Cassandra moves part of the data to the new node.

2) During the migration, the data is written block by block, and the size of each block is random. If a block is smaller than 16 MB, that part of the data migration succeeds. If a block is larger than 16 MB, it fails and the data migration stops.

3) After the newly added node is restarted, the data migration starts again. Because the block sizes are random, the block that failed last time may be split into blocks smaller than 16 MB when it is transmitted again, so this time the transfer succeeds.

This understanding would explain why the data can be transferred successfully after a restart.


If this understanding is correct, then the block size is decided by the Cassandra server itself. So is this an issue in Cassandra itself?

