What is the difference between a wide column store and wide rows / dynamic columns?
Cassandra stores data in column families (tables, in CQL terms): collections of rows, each of which can contain any subset of the columns. Storage is sparse because only the columns that actually have a value are stored for a row. Rows can therefore hold any number of columns, unlike rows in a relational table, which all share the same fixed set of columns.
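To make "sparse" a bit more concrete, here is a minimal Python sketch. It is only a toy model, not how Cassandra lays data out on disk, and the column names (`email`, `city`) and helper functions are made up for illustration. Each row keeps only the columns that were written, so a missing column costs nothing.

```python
# Toy model of sparse storage: each row keeps only the columns that were written.
# Illustration only; the column names and helpers are hypothetical.
rows = {
    "jackjones": {"email": "jack@example.com", "city": "Sydney"},
    "maryann": {"email": "mary@example.com"},  # no "city" cell is stored at all
}


def stored_cells(table):
    """Count the cells that actually exist across all rows."""
    return sum(len(columns) for columns in table.values())


def read_column(table, key, column):
    """Return the cell value, or None when the column was never written."""
    return table.get(key, {}).get(column)


print(stored_cells(rows))
print(read_column(rows, "maryann", "city"))
```

Reading a column that was never written simply returns nothing, which is roughly what a null looks like from the client side.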
To illustrate, consider this table of users' email addresses:
CREATE TABLE user_emails (
    username text,
    email_type text,
    email_address text
    ...
    PRIMARY KEY (username, email_type)
)
In this example, a user can have any number of email addresses, one per email_type, like this:
INSERT INTO user_emails (username, email_type, email_address)
    VALUES ('jackjones', 'personal', 'email@example.com');
INSERT INTO user_emails (username, email_type, email_address)
    VALUES ('jackjones', 'work', 'firstname.lastname@example.org');
INSERT INTO user_emails (username, email_type, email_address)
    VALUES ('jackjones', 'other', 'email@example.com');
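The effect of a compound primary key like this one can be sketched with a toy in-memory model in Python. This is a simplification under assumed names (`partitions`, `insert`, `select_partition` are hypothetical helpers, and the addresses are invented). The partition key (`username`) groups rows together, the clustering key (`email_type`) orders them inside the partition, and writing the same primary key twice overwrites the earlier row.

```python
from collections import defaultdict

# Toy model: partition key -> {clustering key -> remaining columns}.
# Hypothetical helpers for illustration, not Cassandra's storage engine.
partitions = defaultdict(dict)


def insert(username, email_type, email_address):
    """Write one row; a repeated (username, email_type) overwrites the old row."""
    partitions[username][email_type] = {"email_address": email_address}


def select_partition(username):
    """Return the rows of one partition, ordered by clustering key."""
    rows = partitions.get(username, {})
    return [(email_type, row["email_address"]) for email_type, row in sorted(rows.items())]


insert("jackjones", "personal", "jack@example.com")
insert("jackjones", "work", "jack.jones@example.org")
insert("jackjones", "work", "jj@example.org")  # same primary key, so this is an upsert

print(select_partition("jackjones"))
# [('personal', 'jack@example.com'), ('work', 'jj@example.org')]
```

The partition grows with every new `email_type` written for a user, which is the "wide" part.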
Another example is a table of video comments:
CREATE TABLE video_comments (
    video_id text,
    comment_id text,
    username text,
    comment text,
    PRIMARY KEY (video_id, comment_id)
);
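Any particular video can have thousands or even millions of comments. Put another way, a partition (a video) can hold thousands of comments. In CQL terms these are rows within the partition; in the older Thrift terminology they were columns in one wide row.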
These examples illustrate what is referred to as a wide column store (or wide partitions). Cheers!
As explained here:
Tabular databases organize data in rows and columns, but with a twist from the traditional RDBMS. Also known as wide-column stores or partitioned row stores, they provide the option to organize related rows in partitions that are stored together on the same replicas to allow fast queries. Unlike RDBMSs, the tabular format is not necessarily strict. For example, Apache Cassandra™ does not require all rows to contain values for all columns in the table. Like Key/Value and Document databases, Tabular databases use hashing to retrieve rows from the table. Examples include: Cassandra, HBase, and Google Bigtable.
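As a rough illustration of the hashing mentioned in the quote, here is a deliberately simplified Python sketch. The assumptions are mine: the node names are invented, MD5 is used as a stand-in hash, and a plain modulo replaces a real token ring (Cassandra's default partitioner is Murmur3-based and assigns token ranges to nodes). The only point is that the partition key alone determines where a partition lives, so all rows of that partition end up together.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster


def token(partition_key):
    """Hash the partition key to an integer (MD5 is a stand-in, not Cassandra's hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(partition_key):
    """Pick a node from the token; real clusters use token ranges, not modulo."""
    return NODES[token(partition_key) % len(NODES)]


# Every comment for the same video maps to the same node, whatever its comment_id.
print(owner("video-1") == owner("video-1"))
print(owner("video-1") in NODES)
```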