I am very new to Cassandra. I have worked with Oracle SQL and MongoDB, and I am trying to learn Apache Cassandra so I can use it in a project I am working on.
I have a certain amount of sensors (let's say 20), which might increase in the future. They send data to be stored every 10 seconds. I am aware of bucketing as a way to deal with this type of situation, but I am wondering which of these two primary keys is better:
- PRIMARY KEY ((sensor_id, day_month_year), reported_at);
- PRIMARY KEY ((sensor_id, month_year), reported_at);
I don't know whether using *month_year* puts too much data in a single partition. On the other hand, I think that using *day_month_year* creates too many partitions and slows reads down too much, since a query for a longer time range has to access multiple partitions.
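As a rough sketch of the sizes involved: the calculation below assumes each sensor writes exactly one row every 10 seconds, and uses a purely illustrative row size of 100 bytes (`ASSUMED_ROW_BYTES` is a placeholder, not a measured value). Since `sensor_id` is part of the partition key in both options, the numbers are per sensor.

```python
# Rough per-partition estimate for the two bucketing options.
SECONDS_PER_DAY = 24 * 60 * 60
REPORT_INTERVAL_S = 10      # from the question: one reading every 10 seconds
DAYS_IN_LONGEST_MONTH = 31
ASSUMED_ROW_BYTES = 100     # assumption for illustration only; measure real rows

# Rows that land in one (sensor_id, day_month_year) partition.
rows_per_day = SECONDS_PER_DAY // REPORT_INTERVAL_S

# Rows that land in one (sensor_id, month_year) partition, worst case.
rows_per_month = rows_per_day * DAYS_IN_LONGEST_MONTH

# Approximate partition sizes in megabytes under the assumed row size.
day_partition_mb = rows_per_day * ASSUMED_ROW_BYTES / 1_000_000
month_partition_mb = rows_per_month * ASSUMED_ROW_BYTES / 1_000_000

print(f"daily bucket:   {rows_per_day} rows, ~{day_partition_mb:.2f} MB")
print(f"monthly bucket: {rows_per_month} rows, ~{month_partition_mb:.2f} MB")
```

The row counts follow directly from the 10-second interval; the megabyte figures scale linearly with whatever the real row size turns out to be.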
Which one should I use? If you have other good suggestions, or just some explanations for me, I'd like to hear them.