
victor_188679 asked

Can we store machine learning models in Cassandra?

Currently, we store our machine learning models in Google Cloud Storage. The model sizes vary from 10MB to 200MB. I am wondering whether we can store them in Cassandra instead, but I am not sure whether Cassandra would give us any improvement in read/write performance.

Tags: performance, blob

1 Answer

Erick Ramirez answered

I imagine your models are binary data, so the only way to store them in Cassandra is in a column of the blob type.

Blobs are intended for storing small images or short strings. It is possible to store as much as 2GB of data in a blob, but for performance reasons the recommended size is 1MB or less. This is because a blob value can't be broken up, so it can't be streamed: the whole value has to be read or written in one go.

Most use cases that include binary data (streaming/on-demand services, for example) store the metadata in Cassandra (say, the attributes of a video), while the binary data (the video itself) is placed in an object store such as Google Cloud Storage or Amazon S3.
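Applied to models, that split could look like the minimal sketch below. The table `ml.model_metadata`, its columns, the bucket name and the object path layout are all hypothetical and do not come from the question. The code only builds the parameterised CQL statement and its values; it does not connect to Cassandra or to an object store.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical table, for illustration only:
#   CREATE TABLE ml.model_metadata (
#       model_name text, version int, size_bytes bigint,
#       object_uri text, uploaded_at timestamp,
#       PRIMARY KEY (model_name, version));
INSERT_CQL = (
    "INSERT INTO ml.model_metadata "
    "(model_name, version, size_bytes, object_uri, uploaded_at) "
    "VALUES (%s, %s, %s, %s, %s)"
)


@dataclass(frozen=True)
class ModelMetadata:
    """Small attributes kept in Cassandra; the model bytes live elsewhere."""
    model_name: str
    version: int
    size_bytes: int
    object_uri: str
    uploaded_at: datetime

    def as_params(self):
        # Same order as the column list in INSERT_CQL.
        return (self.model_name, self.version, self.size_bytes,
                self.object_uri, self.uploaded_at)


def describe_model(model_name, version, size_bytes, bucket):
    """Build the metadata row for a model already uploaded to the object store.

    The path layout under the bucket is an assumption made for this sketch.
    """
    uri = f"gs://{bucket}/models/{model_name}/v{version}/model.bin"
    return ModelMetadata(model_name, version, size_bytes, uri,
                         datetime.now(timezone.utc))
```

With the DataStax Python driver, the statement and values could then be passed to `session.execute(INSERT_CQL, meta.as_params())`. A read would fetch the row by model name and version, then download the binary from `object_uri`.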

Depending on your use case and access patterns, models as big as 10MB might work and will most certainly be more performant than S3 or GCS but anything larger than 10MB is likely not viable. Cheers!
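As a rough sketch, the size guidance in this answer can be written as a small helper. The thresholds are the 1MB and 10MB figures mentioned above; the function name and the returned labels are made up for illustration, and the result is a starting point for testing rather than a rule.

```python
MB = 1024 * 1024

RECOMMENDED_BLOB_BYTES = 1 * MB   # recommended blob size from the answer
UPPER_BLOB_BYTES = 10 * MB        # "might work" ceiling from the answer


def suggest_store(size_bytes: int) -> str:
    """Return a hypothetical label for where a model of this size could go."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= RECOMMENDED_BLOB_BYTES:
        return "cassandra-blob"
    if size_bytes <= UPPER_BLOB_BYTES:
        # Might work depending on access patterns; worth benchmarking first.
        return "cassandra-blob-test-first"
    # Larger than 10MB: keep the binary in an object store.
    return "object-store"
```

For the sizes in the question, `suggest_store(10 * MB)` falls into the "test first" bucket and `suggest_store(200 * MB)` points to the object store.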


victor_188679 commented:

Thank you for the answer, Erick. It's very clear.

Erick Ramirez commented (in reply to victor_188679):

Not a problem at all. I would be very interested to hear the outcome if you end up testing the 10MB blobs, or whether you decide to stick with Google Cloud Storage. Cheers!
