Tri asked jim.dickinson_187342 answered

How stable is the Operator Lifecycle Manager?

I managed to run a 3-node Cassandra cluster (Week5, Deploy a Cassandra Cluster with Cass Operator) on my single-node Kubernetes cluster, using microk8s 1.18.6 on Ubuntu 20.04 (8 cores, 32 GB RAM).

But when I reached the metrics part (Monitor a Cluster with the Metrics Collector), the Operator Lifecycle Manager (OLM) killed my microk8s cluster. I think the OLM is overkill for the purpose of the exercise.

Prometheus + Grafana are lightweight compared to a 3-node Cassandra cluster, and yet this step was very time-consuming and ultimately unstable. I wish you could just deploy Prometheus & Grafana in a more lightweight way (without an operator). BTW, this issue highlights the quality of the Cassandra Operator. Kudos DataStax, this is a high-quality CRD.

The Week5 materials, both the presentation and the supporting materials, are well prepared. You guys put in a lot of work. Thanks. Too bad that OLM monster ruined the experience.


Erick Ramirez answered Tri commented

We're not contributors to the Operator Lifecycle Manager, nor do we have any affiliation with its authors/contributors. We included it in our workshop just to show participants some of the open-source frameworks/technologies that are available out there. :)

As a side note, we keep running into so many microk8s issues that some of us have given up on it. It feels like it isn't worth the trouble. :(

Thanks for the feedback. You are correct. We spend 200+ hours per week to produce just one workshop. We use technologies like Gitpod and Kubernetes but again, we don't have affiliations with the authors/companies that own them so we're learning at the same time as everyone.

Huge kudos go to @Cedrick Lunven, @Aleks Volochnev, @David Jones-Gilardi, @Jack Fryer, @EricZietlow & @bettina.swynnerton for all the production, delivery and post-event support that happens every single week. As soon as we finish a livestream and the cameras are off, they immediately go on to work on the next livestream -- week in, week out.

It's nice to know that attendees like you appreciate the effort. Thanks for being a supporter of our workshops. Cheers!


Tri commented

microk8s works well. It just needs a slight modification of the CassandraDatacenter yaml (adding a new property, allowMultipleNodesPerWorker: true) and a yaml for the StorageClass. I can make a PR if you really want.

But the PR is incomplete because I could not complete the metrics part, simply b/c OLM v0.14.1 put my microk8s in a kind of boot loop, which practically killed it. It's not worth fixing b/c OLM 0.14.1 was released in Jan 2020. If you had chosen a more recent operator-lifecycle-manager release, maybe this would work better.

Because of that missing metrics part, I didn't submit the PR.
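
A minimal sketch of the two changes described above, under stated assumptions. Only `allowMultipleNodesPerWorker: true` comes from the comment itself. The `server-storage` class name, the `microk8s.io/hostpath` provisioner (what `microk8s enable storage` is understood to install), the cluster and datacenter names, the Cassandra version, and all sizes are illustrative guesses, not the workshop's actual files. cass-operator appears to expect explicit resource requests and limits when several Cassandra pods share one worker, so the sketch sets them.

```yaml
# StorageClass backed by the microk8s hostpath provisioner.
# Assumption: `microk8s enable storage` has been run; the class name is hypothetical.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: server-storage
provisioner: microk8s.io/hostpath
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
# CassandraDatacenter allowing all three Cassandra pods on the single worker.
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1                     # hypothetical datacenter name
spec:
  clusterName: cluster1         # hypothetical cluster name
  serverType: cassandra
  serverVersion: "3.11.6"       # assumed version
  managementApiAuth:
    insecure: {}
  size: 3
  allowMultipleNodesPerWorker: true   # the property named in the comment
  resources:                    # illustrative sizes for an 8-core / 32 GB host
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      cpu: "1"
      memory: 2Gi
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: server-storage
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  config:
    jvm-options:                # keep the heap well below the memory limit
      initial_heap_size: "800M"
      max_heap_size: "800M"
```

Saved as one file, this could be applied with `kubectl apply -f <file> -n <namespace>` in the namespace where the operator is running.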

Tri commented

I am not totally convinced that KIND is really better than microk8s for small-scale dev exercises.

Looking underneath, Kind cannot get any more workers than microk8s, because they all run on the same machine (for the week5 exercises). Quite the contrary: Kind runs inside Docker while microk8s runs on metal.

There are still a lot of things we can do in a single-node Kubernetes cluster. Requiring a multi-node K8S cluster sets the bar pretty high for an exercise. And when you look into the details, this multi-node requirement is actually just there to run Prometheus & Grafana, which is not a good reason in my opinion.

Cedrick Lunven answered Tri commented

Thank you @Tri for this feedback but also for your pull requests to improve the code week after week.

You can tell we are working from one week to the next to produce the content and get everything as smooth as possible. OLM is probably consuming resources; our idea there was also to show some kind (joke) of UI and not only terminal commands.

Keep the feedback coming!


Tri commented

The idea of showing the metrics via Prometheus + Grafana is excellent. I would even say this was my motivation for practicing the week5 exercises. As mentioned in my comment above, the choice of OLM was actually a strong constraint that drives the choice of K8S distro and resources.

I would go the low-tech route: build your own preconfigured Docker images of Prometheus & Grafana and deploy them on K8S without any need for operators. Even though this sounds like more labor, I think the learning experience would be much better. I would volunteer to do this task, but unfortunately I have very little time.

Granted, they won't have any lifecycle management. But the priority of the exercise is to show the Cassandra metrics graphs, not the funny operator mechanics controlling Prometheus & Grafana, which are way off topic.
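
A rough sketch of what an operator-free deployment could look like, using plain Deployments and a ConfigMap. It swaps the custom preconfigured images suggested above for the stock `prom/prometheus` and `grafana/grafana` images (pin a tag in practice). The scrape target is an assumption: it presumes a headless Service named `cluster1-dc1-all-pods-service` in a `cass-operator` namespace that resolves to the Cassandra pods, and that the Metrics Collector serves Prometheus metrics on port 9103. Adjust both to whatever the cluster actually exposes.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: cassandra
        dns_sd_configs:
          - names:
              # Hypothetical headless Service resolving to every Cassandra pod
              - cluster1-dc1-all-pods-service.cass-operator.svc.cluster.local
            type: A
            port: 9103          # assumed Metrics Collector port
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - --config.file=/etc/prometheus/prometheus.yml
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config      # mounts the ConfigMap above as prometheus.yml
              mountPath: /etc/prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-config
---
# In-cluster address for Grafana to reach Prometheus: http://prometheus:9090
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  selector:
    app: prometheus
  ports:
    - port: 9090
      targetPort: 9090
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana
          ports:
            - containerPort: 3000
```

After `kubectl apply -f` on this file, `kubectl port-forward deploy/grafana 3000:3000` should expose the Grafana UI locally, where Prometheus can be added as a data source at `http://prometheus:9090`. Nothing here is persisted or supervised beyond what a Deployment provides, which matches the trade-off described above.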

jim.dickinson_187342 answered

Cass Operator engineer here. Thank you for the feedback! I like microk8s, and you're right that at the end of the day you can't fake resources your single physical computer doesn't have, but I think KIND/k3d presenting a simulation of multiple k8s workers is a little easier to work with for testing vs. one large k8s worker.
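
For reference, a minimal KIND configuration that simulates several workers on one machine might look like the sketch below. The count of three workers is an assumption chosen to match a 3-node Cassandra cluster; every node is a Docker container on the same host, so the total CPU and memory remain those of the single physical machine.

```yaml
# kind-config.yaml: one control-plane node plus three simulated workers
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker
```

A cluster can then be created with `kind create cluster --config kind-config.yaml`.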

As far as OLM being clunky and wanting a lighter weight prom / grafana experience, this is great feedback. We're planning something you'll like. ;)
