We're setting up additional monitoring for a couple of DSE clusters hosted in AWS. The "how" part of this seems pretty well documented, but the "what" part seems oddly missing in the publicly available material.
By "the what part" I mean: which DSE metrics should definitely be watched, and which are good candidates to keep an eye on as well? The clusters run Cassandra + Spark + Graph, so we're interested in metrics pertaining to those workloads specifically.
In a former job I could always just "look at Mike's doc on DSE monitoring" and go from there, but I no longer have access to it. In other circumstances I might install OpsCenter, take its starting defaults, and go from there, but OpsCenter isn't a good fit for where we are with our current monitoring.
So, is there a list of "the top 20 metrics you should be watching in DSE" somewhere which could serve as a good starting place for our team? Or should I just have an intern go through these files: https://github.com/datastax/dse-metric-reporter-dashboards/tree/master/grafana/dashboards and pull out the metrics in use there?
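For what it's worth, the second option could probably be scripted rather than done by hand. Below is a rough Python sketch, assuming the dashboards are ordinary Grafana JSON files whose panel queries are stored under an `expr` key (Prometheus-style), which I have not verified for every file in that repo. The function names are my own, and the regex is only a heuristic: it catches a metric name directly followed by a `{...}` label selector and misses bare names.

```python
import json
import re
from pathlib import Path

# Heuristic: a Prometheus-style metric name directly followed by a label selector.
# Metric names written without "{...}" are NOT picked up by this pattern.
METRIC_RE = re.compile(r"([a-zA-Z_:][a-zA-Z0-9_:]*)\s*\{")


def iter_exprs(node):
    """Yield every string stored under an "expr" key anywhere in the JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "expr" and isinstance(value, str):
                yield value
            else:
                yield from iter_exprs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_exprs(item)


def metrics_in_dashboard(dashboard):
    """Return the set of metric names found in one parsed dashboard."""
    names = set()
    for expr in iter_exprs(dashboard):
        names.update(METRIC_RE.findall(expr))
    return names


def metrics_in_dir(path):
    """Return a sorted list of metric names across all *.json files in a directory."""
    names = set()
    for file in sorted(Path(path).glob("*.json")):
        with open(file, encoding="utf-8") as fh:
            names |= metrics_in_dashboard(json.load(fh))
    return sorted(names)
```

After cloning the repo, something like `metrics_in_dir("grafana/dashboards")` would give a deduplicated list to start from. That still only tells me which metrics the dashboards use, not which ones matter most, so the original question stands.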