
dtln812 asked:

Is it a good idea to increase max connections per host?

I have a server that could potentially receive a huge number of specific requests per second, and each such request should persist some data in Cassandra. Because of the limit of 2048 simultaneous requests per connection, the driver starts to throw BusyPoolException.

So my questions are:
1. I know the Java driver for Cassandra has a throttler that you can configure to enqueue requests until in-flight requests complete. I didn't see such a configuration in the documentation for the C# driver. Is it just undocumented, or does the C# driver currently not have it, so the solution would be to write my own throttler based on a semaphore?
2. Is it good practice to set max connections per host higher than the default (2 local, 1 remote), and if it's okay to do so, what number is considered a reasonable middle ground? (The pooling options I'm referring to are sketched below.)
3. Is it good practice to increase max requests per connection, or if I decide to increase the number of connections per host and implement a throttling mechanism, is there then no reason to increase max requests per connection?
4. Should max requests per connection and the max simultaneous requests per connection threshold be equal, or should the simultaneous threshold be lower by some percentage?
5. I didn't find enough info on RetryPolicy, so I'm not sure which RetryPolicy would best fit the use case described above.
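
For reference, this is roughly how I understand these pooling options are configured through the C# driver builder (the contact point, keyspace and numbers below are placeholders, not values I'm actually running with):

using Cassandra;

var cluster = Cluster.Builder()
    .AddContactPoint("127.0.0.1")
    .WithPoolingOptions(new PoolingOptions()
        // connections the driver keeps open to each node
        .SetCoreConnectionsPerHost(HostDistance.Local, 2)
        .SetMaxConnectionsPerHost(HostDistance.Local, 8)
        // hard cap of in-flight requests a single connection will accept
        .SetMaxRequestsPerConnection(2048)
        // threshold at which the pool grows towards the max
        .SetMaxSimultaneousRequestsPerConnectionTreshold(HostDistance.Local, 1024))
    .Build();
var session = cluster.Connect("my_keyspace");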

csharp driver

joao.reis answered:

I'll answer the questions first and at the end I'll add some general comments.

  1. Correct, you would have to write your own throttler outside of the driver, which may not be the easiest thing to do. I've never tried this, so I'm not sure whether you could leverage a custom RetryPolicy or a custom LoadBalancingPolicy, or whether you would have to implement it entirely on top of the driver as a wrapper of sorts.

  2. There's no good number that fits all cases. The default is usually fine for most workloads but if you're running into BusyPoolException then it might make sense to tweak this.

  3. I would say that in that case there wouldn't be a need to change the default requests per connection.

  4. The max simultaneous requests per connection threshold should generally be lower than max requests per connection so that the driver can proactively open more connections in a pool before all of that pool's connections become saturated and BusyPoolException is thrown.

  5. I don't think a RetryPolicy will help you here; it just lets you choose whether a request should be retried on the same host, retried on the next host, or fail immediately.

Implementing your own throttler might not be trivial, so I would suggest that you start by tweaking only the number of core connections per host and see how that works out. The total number of application instances you have deployed will be a factor in how much higher the number of connections per host per instance can go, so you have to test this and see how it works out.

I would not mess too much with the dynamic resizing of connection pools, because it can create more problems for you if the driver is frequently opening connections while in the middle of processing requests (a latency impact, for example). We are even considering removing this dynamic resizing capability completely in the next major version of the driver and forcing users to set a static number of connections per host, but for now the feature is still there if you want to try it out and run some tests with it.

So my final recommendation would be to set the number of core connections and max connections to the same value and increase it according to your expected throughput and latencies, considering that each connection handles 2048 simultaneous requests.
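
As a rough sketch (the value 4 here is only an example; the right number depends on your own testing), that would look something like this with the builder:

var poolingOptions = new PoolingOptions()
    // core == max keeps the pool size effectively static (no dynamic resizing)
    .SetCoreConnectionsPerHost(HostDistance.Local, 4)
    .SetMaxConnectionsPerHost(HostDistance.Local, 4);

var cluster = Cluster.Builder()
    .AddContactPoint("10.0.0.1") // placeholder contact point
    .WithPoolingOptions(poolingOptions)
    .Build();

With 4 connections per host, each node can then have up to 4 x 2048 requests in flight from this application instance before BusyPoolException is thrown.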

Note that increasing the max requests per connection will prevent the connection pools from throwing BusyPoolException, but the connections themselves will always limit the number of in-flight requests to 2048 regardless, so requests will be enqueued in the connection. In a way, you could try to use this as a throttling mechanism, because the connection is basically throttling any requests that arrive after the 2048 threshold is hit, but throttling will increase latencies, so I would still go with the option of just increasing the number of connections first.


dtln812 commented:

1. I saw an example based on a semaphore and acquiring/releasing it (https://github.com/alexott/datastax-bootcamp-project/blob/master/java/src/main/java/com/datastax/alexott/bootcamp/SessionLimiter.java). That's what I meant in my question; I didn't see any examples based on custom load balancing or retry policies, though.

Regarding increasing max requests, I think I've noticed what you're describing, so yeah, I'm not sure it's the best idea.

I'll try changing the number of connections then, but I'm not sure what an acceptable number of connections per host is:

SetMaxConnectionsPerHost(...)
SetCoreConnectionsPerHost(...)


joao.reis commented:

Hmm, indeed it should be possible to implement that semaphore-based wrapper in C# as well.

Finding a good number of connections per host mostly comes down to experimentation and testing. I've seen server nodes handling thousands of connections just fine, but the throughput was fairly low. The hardware you're using and the OS configuration also matter.
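
Just as a rough sketch of what that SemaphoreSlim-based wrapper could look like (the class name is made up and this isn't production-hardened code; you'd pick the limit based on your pool configuration):

using System.Threading;
using System.Threading.Tasks;
using Cassandra;

public class ThrottledSession
{
    private readonly ISession _session;
    private readonly SemaphoreSlim _semaphore;

    public ThrottledSession(ISession session, int maxConcurrentRequests)
    {
        _session = session;
        _semaphore = new SemaphoreSlim(maxConcurrentRequests);
    }

    public async Task<RowSet> ExecuteAsync(IStatement statement)
    {
        // wait for a free slot while the configured number of requests are in flight
        await _semaphore.WaitAsync().ConfigureAwait(false);
        try
        {
            return await _session.ExecuteAsync(statement).ConfigureAwait(false);
        }
        finally
        {
            // free the slot whether the request succeeded or failed
            _semaphore.Release();
        }
    }
}

You would route all writes through ThrottledSession.ExecuteAsync instead of calling the session directly, with the limit chosen from your pool configuration (e.g. somewhat below connections per host x 2048) so the wrapper starts queueing before the pools do.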

dtln812 commented:

So the plan is: increase the number of connections for now, until I'm no longer satisfied with the speed of writes into Cassandra; then tune the number of connections until I find the count that performs best, stop at that number, and implement the throttler; and when the throttler gets overwhelmed too and writes become too delayed, just add more nodes. Does that make sense to you?

dtln812 commented:

I'm also thinking about maybe using Kafka as a persistent layer between the server and Cassandra, so that if I don't spot in time that I need additional nodes in my cluster, I won't lose the incoming requests. In that case I'm not sure I'd need a throttler at all if I have Kafka. If I use Kafka to log the incoming requests, I can have a service that processes the Kafka events and removes the handled requests from the list.
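
Roughly what I have in mind, just a sketch using the Confluent.Kafka client (the topic name, the payload serialization and catching only BusyPoolException are placeholders for the real failure handling):

using System.Threading.Tasks;
using Cassandra;
using Confluent.Kafka;

public class WriteWithFallback
{
    private readonly ISession _session;
    private readonly IProducer<Null, string> _producer;

    public WriteWithFallback(ISession session, IProducer<Null, string> producer)
    {
        _session = session;
        _producer = producer;
    }

    public async Task PersistAsync(IStatement statement, string requestPayloadJson)
    {
        try
        {
            await _session.ExecuteAsync(statement);
        }
        catch (BusyPoolException)
        {
            // the driver is saturated: park the payload in Kafka so a separate
            // consumer service can replay it against Cassandra later
            await _producer.ProduceAsync("pending-writes",
                new Message<Null, string> { Value = requestPayloadJson });
        }
    }
}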

Erick Ramirez answered:

I'm going to defer to Joao because he is an expert on this matter, being the top contributor to the driver. However, I wanted to add that in most cases the default driver configuration is the right choice.

Unless you have a very specific edge case, implementing your own custom throttler is quite extreme and in all my years, I'm not aware of anyone who has done that. Of course you can do it since the drivers are open-source after all. :)

In my experience, the driver configuration is not really the problem here. The driver issues up to 2048 asynchronous requests per node. When you get the BusyPoolException, it means that all the nodes are busy processing requests and there are thousands more to get through in the queue.

Another way of putting it is the BusyPoolException is an indication that the cluster has reached maximum capacity because there are more requests hitting the cluster than it can process. You should consider increasing the capacity of your cluster by adding more nodes. Cheers!


dtln812 commented:

Thank you for your answer.

Well, I didn't mean changing or writing anything in the driver's code, but rather creating a checker for the current requests and, if the limit is hit, enqueueing them in some sort of list; digging into the driver code is a little early for me.

OK, so it's better to leave the default configuration and increase the number of nodes, and for use cases where the number of requests could increase significantly in a very short time, just have an in-memory queue or a Kafka topic or something to hold the new requests until I add more nodes. Would that be the most efficient way?

Erick Ramirez commented:

It didn't occur to me that posting to Kafka temporarily was going to be easier than adding nodes. :)

So no, I wouldn't say that would be the most efficient way. Cheers!

dtln812 commented:

True, but if overnight I get 10x more traffic that I wasn't expecting, and there aren't enough nodes at that moment, data will be lost. Kafka is just insurance: if a query fails, it'll be logged in Kafka so it can potentially be retried in the future.
