somrcha1_186924 asked Erick Ramirez commented

Is Cassandra a good candidate for high throughput and low latency?

Hi,

We have been asked to consider Cassandra as the database for the requirements below:

1> The database needs to support an average throughput of 90 messages per second, with 175 per second at peak time.

2> Average response time of 15 seconds and a maximum response time of 2 minutes.

3> Database growth of 2 TB per month.

4> Referential integrity constraints will be required.

Would you suggest Cassandra as a good candidate here, especially given the referential integrity constraints? Could any alternative be suggested?

data modeling, performance

smadhavan answered somrcha1_186924 commented

@somrcha1_186924, thanks for your question, and it's good to see you considering a move from a traditional RDBMS to NoSQL!

Apache Cassandra can very easily satisfy the throughput, latency and data growth requirements, as it scales horizontally very well. Cassandra itself has no concept of referential integrity across tables, but it does support features such as lightweight transactions and batches that your application can use to enforce those constraints itself.
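As a rough illustration of what "the application enforces it" could look like, here is a minimal Python sketch that only builds CQL text for a lightweight transaction and a logged batch. It does not connect to a cluster, and the table and column names (orders, order_items, items_by_customer) are hypothetical, chosen purely for the example.

```python
def insert_if_absent(table, columns):
    """Build a lightweight-transaction INSERT (IF NOT EXISTS) with bind markers."""
    names = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({names}) VALUES ({marks}) IF NOT EXISTS"


def logged_batch(statements):
    """Wrap unconditional statements in a logged batch so they are applied together."""
    body = "\n".join(f"  {s};" for s in statements)
    return f"BEGIN BATCH\n{body}\nAPPLY BATCH;"


# Hypothetical usage: create the parent row only if it does not already exist ...
parent_cql = insert_if_absent("orders", ["order_id", "customer_id"])

# ... then write the child row and its denormalised copy in one batch.
children_cql = logged_batch([
    "INSERT INTO order_items (order_id, item_id, qty) VALUES (?, ?, ?)",
    "INSERT INTO items_by_customer (customer_id, item_id, order_id) VALUES (?, ?, ?)",
])

print(parent_cql)
print(children_cql)
```

The application, not the database, decides that the child rows are only written once the parent insert has succeeded. Both features carry a performance cost, so they are best used sparingly.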

Other resources for further reading:

I encourage you to try these out on DataStax Astra, which offers an always-free tier so you can get hands-on and even run a low-volume production database without putting a credit card on file!


somrcha1_186924 commented ·

Thanks madhavan. This is of much help

Erick Ramirez answered Erick Ramirez commented

Cassandra is definitely a good choice for systems which require a high throughput and low latency.

Your requirements of 175 operations per second and an average response time of 15 seconds are in fact extremely low for most Cassandra deployments. A typical 3-node cluster can handle anywhere from 30K to 50K ops/s with a p95 latency of 7-9 milliseconds. Most of the companies I work with have an SLA of 10 ms or less 99% of the time.

2 TB of growth per month can be handled easily too. Cassandra clusters scale linearly, so as the data density on each node grows you just need to add more nodes to increase the capacity of the cluster.
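To get a feel for how the node count follows the data growth, here is a back-of-the-envelope sketch in Python. The replication factor of 3 and the target of 1 TB of data per node are assumptions picked for illustration only, not recommendations, and the estimate ignores compression, compaction headroom and data expiry.

```python
import math


def nodes_needed(raw_tb, replication_factor=3, density_tb_per_node=1.0):
    """Estimate node count from raw data size (all defaults are assumptions)."""
    total_tb = raw_tb * replication_factor  # every row is stored on RF nodes
    # A cluster needs at least as many nodes as the replication factor.
    return max(replication_factor, math.ceil(total_tb / density_tb_per_node))


GROWTH_TB_PER_MONTH = 2  # from the stated requirement

for month in (1, 6, 12):
    raw_tb = GROWTH_TB_PER_MONTH * month
    print(f"month {month}: {raw_tb} TB raw, about {nodes_needed(raw_tb)} nodes")
```

Raising the assumed density per node lowers the node count proportionally, which is why the per-node target is worth settling early in capacity planning.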

Cassandra does not support JOINs, so there are no foreign keys and no foreign key lookups. You choose Cassandra because you have a scale problem that you are trying to solve. Access patterns that depend on joins are not recommended with Cassandra because joins are slow and do not scale well.

To achieve high throughput with very low latency, the data model is denormalised: you create a table for each application query, so that one and only one table is ever accessed to satisfy that query.

If you're interested, there's a free hands-on tutorial on the differences between relational databases and Cassandra here: https://www.datastax.com/dev/modeling. You'll learn about the Cassandra data model and how to handle table joins. Cheers!
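The "one table per query" idea can be sketched in plain Python, with each dict standing in for a Cassandra table keyed by its partition key. The message fields and the two queries (by sender, by day) are hypothetical and only there to show the pattern: the same row is written to every table, and each read touches exactly one of them.

```python
from collections import defaultdict

# Each dict stands in for one table, keyed by that table's partition key.
messages_by_sender = defaultdict(list)
messages_by_day = defaultdict(list)


def save_message(msg_id, sender, day, body):
    """Write the same row to every query table (denormalised write)."""
    row = {"msg_id": msg_id, "sender": sender, "day": day, "body": body}
    messages_by_sender[sender].append(row)
    messages_by_day[day].append(row)


def get_by_sender(sender):
    """Query 1: served from a single table, no join."""
    return list(messages_by_sender.get(sender, []))


def get_by_day(day):
    """Query 2: served from a different single table, no join."""
    return list(messages_by_day.get(day, []))


save_message(1, "alice", "2021-03-01", "hello")
save_message(2, "bob", "2021-03-01", "hi")
save_message(3, "alice", "2021-03-02", "bye")

print([m["msg_id"] for m in get_by_sender("alice")])
print([m["msg_id"] for m in get_by_day("2021-03-01")])
```

The trade-off is extra writes and storage in exchange for reads that never need to combine tables.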


somrcha1_186924 commented ·

Thanks Erick. This is of much Help.

Erick Ramirez ♦♦ somrcha1_186924 commented ·

Happy to help. Cheers!
