question

aidartf_193749 avatar image
aidartf_193749 asked Erick Ramirez answered

When is it preferable to use materialized views over BATCH statements?

Hello,

I've recently watched DS220 and have a question related to materialised view and batch operation.

Both of them help to keep tables in consistent state (batch from application perspective and materialised from DB perspective). Materialised view should be based on source table, while batch do not.

Maybe someone could recommend use cases when one is preferable to the other.

Is one faster that another?

materialized views
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

Erick Ramirez avatar image
Erick Ramirez answered

Data Modeling in Cassandra requires that you denormalize your data by creating one table for every application query. Doing so means that you are "materializing your views".

CQL BATCH statements is the mechanism by which you can keep the data in sync across multiple tables such that updates to the tables are done as one atomic operation.

For convenience, Materialized Views was added as a feature in Cassandra 3.0 (see blog post New in Cassandra 3.0: Materialized Views) to remove the burden from the application of having to keep tables in sync and instead off-load it server-side (CASSANDRA-6477). There are however issues with MVs since task of updating the views with the base table is asynchronous. At some point when the views get so out-of-sync with the base table and the data can't be repaired, the only solution is to drop the problematic view and recreate it to force Cassandra to rebuild the view from scratch.

Due to issues with MVs, they have been reclassified as experimental with appropriate warnings for users who enable them so they at least are fully aware of its shortcomings should they decide to use this feature (CASSANDRA-13959).

For the record, there are lots of organisations who use materialized views in production and using them successfully. The feature does work but if you run into out-of-sync issues, the only workaround available today is to drop and recreate the view. There are ongoing efforts to make MVs stable in C* 4.0 and future releases.

For now, the recommendation is to manually denormalize and use CQL BATCH statements. Materialized views are an option but be aware of the caveats. Cheers!

Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

jliew1975_193867 avatar image
jliew1975_193867 answered

Materialised view is considered experimental and I would recommend not using it.

Share
10 |1000

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.