vkayanala_42513 asked david.cao edited

What are the best tools out there to monitor Cassandra servers?

Cassandra Monitoring:
- Disk / IO latencies
- CPU utilization
- Memory
- JVM metrics

Plus any other metrics needed to monitor Cassandra, such as those available from nodetool output?


Thanks,

-Varun

monitoring

adityajain22_141041 answered

@vkayanala_42513

The best tool for DSE is OpsCenter.

But there are many third-party tools you can configure, such as:

- Datadog
- Grafana
- Dynatrace
- APM tools
- InfluxDB
- Telegraf
- etc.


Erick Ramirez answered vkayanala_42513 commented

@vkayanala_42513 In DSE 6.7, we introduced the new DSE Metrics Collector which collects metrics data that you can push to monitoring systems like Prometheus and Graphite. For more info, see Improved Performance Diagnostics With DataStax Metrics Collector. Cheers!


vkayanala_42513 commented

Thank you.

vkayanala_42513 commented

Thanks for responding to my question.

@Erick Ramirez I'm following this document: https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/tools/metricsCollector/mcExportMetricsManually.html to get Cassandra metrics into Prometheus and then display them on Grafana dashboards.

vkayanala_42513 commented (continued)

I'm facing some challenges in this process. Both the Prometheus and Grafana services are up and running, but Prometheus can't reach the DSE server endpoints, as you can see in the screenshot below.

Port 9103 is open on these servers. What else do I need to verify to get the status to UP? Also, no data is showing up in Grafana.

What am I missing here ;) Any help would be appreciated!

-Cheers!
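
One way to see why a target is not UP is to ask Prometheus itself through its HTTP API (`/api/v1/targets`), which reports each target's health and last scrape error. The following is a minimal sketch using only the Python standard library; the helper names `fetch_targets` and `down_targets` are illustrative, and the Prometheus address you pass in is an assumption about your setup.

```python
import json
from typing import List, Tuple
from urllib.request import urlopen


def down_targets(targets_response: dict) -> List[Tuple[str, str]]:
    """Return (scrapeUrl, lastError) for every active target that is not up."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return [
        (target.get("scrapeUrl", ""), target.get("lastError", ""))
        for target in active
        if target.get("health") != "up"
    ]


def fetch_targets(prometheus_url: str) -> dict:
    """Fetch the target list from the Prometheus HTTP API."""
    with urlopen(f"{prometheus_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)
```

For example, `down_targets(fetch_targets("http://localhost:9090"))` would list the failing endpoints together with the error Prometheus recorded for each, assuming Prometheus listens on its default port on the local host.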

Erick Ramirez commented, replying to vkayanala_42513

@vkayanala_42513 can you verify that the DSE nodes are listening on port 9103?

vkayanala_42513 commented, replying to Erick Ramirez

@Erick Ramirez Port 9103 is open on the DSE servers. But which service uses 9103? Should we start a service to make it listen on that port?

Output from one of the DSE servers:
[root@cassandra-perf-dev-005-db ~]# netstat -lnt | grep 9103

[root@cassandra-perf-dev-005-db ~]#



Erick Ramirez commented, replying to vkayanala_42513

@vkayanala_42513 it looks like you missed step 2 of the document and didn't disable/re-enable the DSE Metrics Collector. If you did, the DSE nodes would be listening on port 9103. For example:

$ sudo lsof -i -n -P | grep LISTEN | grep 9103
ld-2.23.s 7601 cassandra    5u  IPv6  31579      0t0  TCP *:9103 (LISTEN)
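
The same check can be run remotely, for instance from the Prometheus host, with a plain TCP connect. This is a minimal sketch using only the Python standard library; the function name `is_port_open` and the 2-second default timeout are illustrative choices, not part of DSE or Prometheus.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `is_port_open("cassandra-perf-dev-005-db", 9103)` from the monitoring host returns `False` both when nothing is listening on the node and when a firewall blocks the port, so it complements the local `netstat` or `lsof` check rather than replacing it.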

vkayanala_42513 commented, replying to Erick Ramirez

@Erick Ramirez Yeah, you are right. In step two, the prometheus.conf file was placed in the wrong directory. Now the status of the target endpoints is UP. Prometheus is good.

However, I'm not able to get those metrics into Grafana. I noticed this message on the Grafana dashboards: "Datasource named prometheus was not found".
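
That message suggests the dashboards reference a data source called `prometheus` that does not exist in Grafana yet (this is an inference from the error text, not something confirmed in this thread). One possible fix is to create a data source with exactly that name, either in the Grafana UI or through the Grafana HTTP API (`POST /api/datasources`). The sketch below assumes basic-auth credentials and URLs that you would replace with your own; the helper names are hypothetical.

```python
import base64
import json
from urllib.request import Request, urlopen


def datasource_payload(name: str, prometheus_url: str) -> dict:
    """Build the JSON body for a Prometheus data source in Grafana."""
    return {
        "name": name,            # must match the name the dashboards expect
        "type": "prometheus",
        "url": prometheus_url,   # assumption: address of your Prometheus server
        "access": "proxy",       # Grafana's backend queries Prometheus
    }


def create_datasource(grafana_url: str, user: str, password: str, payload: dict) -> dict:
    """POST the payload to Grafana and return the decoded JSON reply."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urlopen(request, timeout=5) as resp:
        return json.load(resp)
```

A call might look like `create_datasource("http://localhost:3000", "admin", "<password>", datasource_payload("prometheus", "http://localhost:9090"))`, where both URLs are the default ports and only placeholders for your environment.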

amitmund_177224 answered Erick Ramirez edited

You might like to look at the following open source tools, which can expose the JMX data. You can then put anything on top of them. Storing the data in InfluxDB and plotting it in Grafana would be easy :).

https://github.com/jmxtrans/jmxtrans

and Telegraf with Jolokia2

https://github.com/influxdata/telegraf/tree/master/plugins/inputs/jolokia2
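
To get a feel for what Telegraf's jolokia2 input does, here is a minimal sketch of a Jolokia "read" request made directly over HTTP with the Python standard library. It assumes a Jolokia agent is already attached to the Cassandra JVM and reachable at a URL such as `http://localhost:8778/jolokia/` (that address is an assumption about your setup); the helper names are hypothetical.

```python
import json
from typing import Optional
from urllib.request import Request, urlopen


def read_request(mbean: str, attribute: Optional[str] = None) -> dict:
    """Build a Jolokia 'read' request body for one MBean (and optional attribute)."""
    body = {"type": "read", "mbean": mbean}
    if attribute is not None:
        body["attribute"] = attribute
    return body


def jolokia_read(base_url: str, mbean: str, attribute: Optional[str] = None):
    """POST a read request to the Jolokia agent and return the 'value' field."""
    request = Request(
        base_url,
        data=json.dumps(read_request(mbean, attribute)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request, timeout=5) as resp:
        reply = json.load(resp)
    if reply.get("status") != 200:
        raise RuntimeError(reply.get("error", "Jolokia request failed"))
    return reply["value"]
```

For instance, `jolokia_read("http://localhost:8778/jolokia/", "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency", "Count")` would return the read request count, provided the agent is running at that address.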

I don't know much yet, and it's still a learning curve. However, the following data might help, and if someone else can share their learnings, that would be helpful.

EDIT: Moved large amount of text to Gist since it made the post hard to read.

david.cao answered david.cao edited

In our environment, we use Grafana + InfluxDB + Telegraf. It is pretty handy, and it is free and open source. We can get almost all the key metrics. If you are interested, take a look here: monitor Cassandra with open source tool
