
Tonny asked ·

Can Cassandra's latency metrics be computed at second-level granularity?

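In metrics-reporter-config-sample.yaml, period is set to 10 and timeunit to SECONDS. In testing, the latency statistics are still recalculated only once per minute; the only thing that changed is the output frequency, which is now once every 10 s.

Just before 2020-04-17 15:53:36 I ran a single query. Its max, mean, median, p75, p95, p99 and so on did not change for 1 min, all staying at 0.7859389999999999, and were only updated again after 1 min.

Is there a way, through configuration changes or otherwise, to get these metrics at second-level granularity?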

p95=0.0, p98=0.0, p99=0.0, p999=0.0, mean_rate=0.030666057915118984, m1=0.031959604782643106, m5=0.1386074981350139, m15=0.17699030804741228, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:53:26,559 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=4, min=0.0, max=0.0, mean=NaN, stddev=0.0, median=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0, mean_rate=0.028482269882182526, m1=0.027053221383234054, m5=0.13406340384371507, m15=0.17503463404473962, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:53:36,557 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.6549499999999999, max=0.7859389999999999, mean=0.7859389999999999, stddev=0.0, median=0.7859389999999999, p75=0.7859389999999999, p95=0.7859389999999999, p98=0.7859389999999999, p99=0.7859389999999999, p999=0.7859389999999999, mean_rate=0.03323652824426685, m1=0.037612595448069434, m5=0.13291935335098476, m15=0.17420246122065935, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:53:46,557 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.6549499999999999, max=0.7859389999999999, mean=0.7859389999999999, stddev=0.0, median=0.7859389999999999, p75=0.7859389999999999, p95=0.7859389999999999, p98=0.7859389999999999, p99=0.7859389999999999, p999=0.7859389999999999, mean_rate=0.031165032453344035, m1=0.03183837467249468, m5=0.12856173862672934, m15=0.17227759184013047, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:53:56,556 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.6549499999999999, max=0.7859389999999999, mean=0.7859389999999999, stddev=0.0, median=0.7859389999999999, p75=0.7859389999999999, p95=0.7859389999999999, p98=0.7859389999999999, p99=0.7859389999999999, p999=0.7859389999999999, mean_rate=0.0293365908203286, m1=0.02695060231048694, m5=0.12434698350573203, m15=0.1703739915169165, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:54:06,558 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.6549499999999999, max=0.7859389999999999, mean=0.7859389999999999, stddev=0.0, median=0.7859389999999999, p75=0.7859389999999999, p95=0.7859389999999999, p98=0.7859389999999999, p99=0.7859389999999999, p999=0.7859389999999999, mean_rate=0.027710466823171714, m1=0.022813192330621956, m5=0.12027040449311446, m15=0.168491425236214, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:54:16,558 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.6549499999999999, max=0.7859389999999999, mean=0.7859389999999999, stddev=0.0, median=0.7859389999999999, p75=0.7859389999999999, p95=0.7859389999999999, p98=0.7859389999999999, p99=0.7859389999999999, p999=0.7859389999999999, mean_rate=0.026255350440209407, m1=0.019310950394286205, m5=0.11632747163722368, m15=0.1666296605800416, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:54:26,557 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.6549499999999999, max=0.7859389999999999, mean=0.7859389999999999, stddev=0.0, median=0.7859389999999999, p75=0.7859389999999999, p95=0.7859389999999999, p98=0.7859389999999999, p99=0.7859389999999999, p999=0.7859389999999999, mean_rate=0.024945569661795546, m1=0.016346366599032474, m5=0.11251380349588663, m15=0.1647884676985463, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:54:36,557 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.6549499999999999, max=0.7859389999999999, mean=0.7859389999999999, stddev=0.0, median=0.7859389999999999, p75=0.7859389999999999, p95=0.7859389999999999, p98=0.7859389999999999, p99=0.7859389999999999, p999=0.7859389999999999, mean_rate=0.023760160665180358, m1=0.013836900594443329, m5=0.10882516226769016, m15=0.1629676192816263, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:54:46,556 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.0, max=0.0, mean=NaN, stddev=0.0, median=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0, mean_rate=0.02268238538370412, m1=0.011712683482324352, m5=0.1052574490828768, m15=0.16116689053086805, rate_unit=events/second, duration_unit=milliseconds
TRACE [metrics-logger-reporter-1-thread-1] 2020-04-17 15:54:56,556 Slf4jReporter.java:320 - type=TIMER, name=org.apache.cassandra.metrics.ClientRequest.Latency.Read, count=5, min=0.0, max=0.0, mean=NaN, stddev=0.0, median=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0, mean_rate=0.021698043447727484, m1=0.009914572517215723, m5=0.10180669944862339, m15=0.1593860591317929, rate_unit=events/second, duration_unit=milliseconds
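To pin down exactly when a reported value changes, the log excerpt above can be scanned programmatically. The following is a minimal sketch that assumes every complete line has the same shape as the Slf4jReporter TIMER lines shown above; `parse_timer_line` and `change_points` are hypothetical helper names, not part of Cassandra or the metrics library.

```python
import re
from datetime import datetime

# Captures the timestamp and the "key=value, key=value" tail of a TIMER line.
# The pattern is an assumption based only on the log lines quoted above.
LINE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*? - (type=TIMER.*)$"
)


def parse_timer_line(line):
    """Return (timestamp, fields) for a TIMER line, or None if it does not match."""
    m = LINE_RE.search(line.strip())
    if m is None:
        return None  # e.g. a truncated line without a timestamp
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    fields = dict(kv.split("=", 1) for kv in m.group(2).split(", "))
    return ts, fields


def change_points(lines, key="max"):
    """Return (timestamp, value) for each report where `key` differs from the previous one."""
    changes = []
    previous = None
    for line in lines:
        parsed = parse_timer_line(line)
        if parsed is None or key not in parsed[1]:
            continue
        ts, fields = parsed
        raw = fields[key]
        # Compare the raw strings so that NaN values do not count as a change every time.
        if raw != previous:
            changes.append((ts, float(raw)))
            previous = raw
    return changes
```

Applied to the excerpt above, the first complete line (15:53:26) is recorded as the starting point, and max then changes at 15:53:36 and again at 15:54:46, which is 70 s later. The truncated first line has no timestamp and is skipped.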


Tags: cassandra, metric, monitoring

1 Answer

Erick Ramirez answered ·

@Tonny The metrics reporter pushes data every 10 seconds. You said your configuration is:

    period: 10
    timeunit: 'SECONDS'

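It looks as though nothing changed for 1 minute, but that is because you ran only one read query.

Run multiple queries over the course of a minute and you will see the statistics change over time. Cheers!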

4 comments

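Yes. So with period set to 10, after running only one read query, I would expect the values to stay unchanged for 10 s rather than 1 min; after 10 s, since there are no further read queries, these metrics should drop to 0.

Take max as an example: with period set to 10, max currently shows the maximum latency over the previous 1 min, whereas I think it should report the maximum latency over the previous 10 s. The same applies to the other metrics.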


Is there anything wrong with my understanding above?


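@Tonny Your understanding is correct. With period set to 10, the metrics reporter reports the current data every 10 seconds.

Only one query was run here, so the data changed only once between 15:53:26 and 15:54:36. You can think of the reporter as polling the current data every 10 seconds; because Cassandra's own data did not change at all between 15:53:36 and 15:54:46, it gives the misleading impression that the values are updated only once per minute.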

Tonny, replying to bonian.hu_177317 ·

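So setting period to 10 here only changes how often the reporter polls; it does not change the interval over which the metric statistics are computed (they are still computed over 1 min of data).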
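The distinction between the reporter's polling period and the statistics window can be illustrated with a toy model. This is only a sketch under stated assumptions: `SlidingWindowMax` is a hypothetical, simplified reservoir that keeps samples for a fixed number of seconds, and it does not reproduce Cassandra's actual reservoir implementation. It only shows why a single query stays visible for about one window length regardless of how often the reporter polls.

```python
from collections import deque


class SlidingWindowMax:
    """Toy reservoir that remembers samples for `window_s` seconds (illustrative only)."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.samples = deque()  # (timestamp_s, latency_ms), oldest first

    def update(self, now_s, latency_ms):
        self.samples.append((now_s, latency_ms))

    def max(self, now_s):
        # Evict samples that have aged out of the window, then report the maximum.
        while self.samples and now_s - self.samples[0][0] >= self.window_s:
            self.samples.popleft()
        return max((v for _, v in self.samples), default=0.0)


def simulate(window_s, period_s, duration_s, query_at_s, latency_ms):
    """Run one query at `query_at_s` and poll the reservoir every `period_s` seconds."""
    reservoir = SlidingWindowMax(window_s)
    recorded = False
    reports = []
    for t in range(period_s, duration_s + 1, period_s):
        # Record the single query once simulated time has reached it.
        if not recorded and query_at_s <= t:
            reservoir.update(query_at_s, latency_ms)
            recorded = True
        reports.append((t, reservoir.max(t)))
    return reports


if __name__ == "__main__":
    # Same 10 s reporter period, two different statistics windows.
    for window in (60, 10):
        reports = simulate(window_s=window, period_s=10, duration_s=90,
                           query_at_s=5, latency_ms=0.786)
        print(window, reports)
```

With a 60 s window the single sample shows up in six consecutive 10 s reports before max falls back to 0.0; with a 10 s window it shows up in only one. In this model, changing `period_s` alone changes how many reports are emitted, not how long the sample influences them.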
