Bringing together the Apache Cassandra experts from the community and DataStax.

Want to learn? Have a question? Want to share your expertise? You are in the right place!

Not sure where to begin? Getting Started

 

question

json avatar image
json asked ·

创建索引,造成大量的GC?

我的一个keyspace 下有三张表,分别为他们创建索引,

索引字段是用雪花算法生成的数字类型。

CREATE CUSTOM INDEX i_liveinfo_id ON class.cl_liveinfo (id) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'mode': 'SPARSE' };

问题分析:

问题:

根据业务的需求:针对一些记录增长速度比较快,写入量比较大的表(高峰日写入过亿),考虑到数据永久保存,及数据的可扩展性,决定将mysql迁移至cassndra

迁移之后产生了报错,影响到数据写入。报错如下:

{"level":"ERROR","ts":"2021-06-12 10:16:20","file":"controller/network.go:30","msg":"UploadNetworkDetails error","error":"gocql: no response received from cassandra within timeout period","uuid":"89b6c07d-62de-447d-85e5-dfb9dbaa59e1"}
{"level":"ERROR","ts":"2021-06-12 10:16:54","file":"controller/network.go:30","msg":"UploadNetworkDetails error","error":"gocql: no response received from cassandra within timeout period","uuid":"59a927d2-1ca8-46df-9a34-58d9dc48c36b"}
{"level":"ERROR","ts":"2021-06-12 10:16:54","file":"controller/network.go:30","msg":"UploadNetworkDetails error","error":"gocql: no response received from cassandra within timeout period","uuid":"b22b589b-6e34-407d-8d9d-9ae74d21dd4a"}
{"level":"ERROR","ts":"2021-06-12 10:16:54","file":"controller/network.go:30","msg":"UploadNetworkDetails error","error":"gocql: no response received from cassandra within timeout period","uuid":"3c3bafc4-673a-4ce1-a7b7-9ff7e53737b1"}
{"level":"ERROR","ts":"2021-06-12 10:16:54","file":"controller/auto_check.go:30","msg":"UploadAutoCheck error","error":"gocql: no response received from cassandra within timeout period","uuid":"a11330ae-8a54-4c62-bae6-ab14af6f541d"}
{"level":"ERROR","ts":"2021-06-12 10:16:54","file":"controller/auto_check.go:30","msg":"UploadAutoCheck error","error":"gocql: no response received from cassandra within timeout period","uuid":"7e4813bd-417c-4473-ac6c-1f08c41cd7bb"}
{"level":"ERROR","ts":"2021-06-12 10:16:54","file":"controller/network.go:30","msg":"UploadNetworkDetails error","error":"gocql: no response received from cassandra within timeout period","uuid":"39f4f485-ca90-4701-bceb-28850b48c26e"}
{"level":"ERROR","ts":"2021-06-12 10:16:54","file":"controller/auto_check.go:30","msg":"UploadAutoCheck error","error":"gocql: no response received from cassandra within timeout period","uuid":"775fdbb2-1743-4f67-8098-5a18b4fec5dd"}
....

分析:

通过分析debug.log 和system.log

发现GC非常频繁

统计system.log

# cat /var/log/cassandra/system.log | grep 'GC '|grep "2021-06-12"|wc -l
193 --一天产生了近两百次GC操作,历史最多近300多次
[root@dcd212 2021-06-15 14:40:49 ~]
# cat /var/log/cassandra/system.log | grep 'GC '|grep "2021-06-12"|awk -F'[ ,:]' '{print $5,$6,$7}'|uniq -c |sort -n
'只保留了一分钟内执行了两次以上的GC记录'
3 2021-06-12 10 08
3 2021-06-12 10 13
3 2021-06-12 10 14
3 2021-06-12 19 54
3 2021-06-12 19 55
4 2021-06-12 10 09
4 2021-06-12 15 05
4 2021-06-12 19 38
5 2021-06-12 10 11
5 2021-06-12 10 12
5 2021-06-12 10 18
6 2021-06-12 19 39
8 2021-06-12 10 15
10 2021-06-12 10 17
13 2021-06-12 10 16

查看:2021-06-12 10:16 和2021-06-12 10:17的GC情况

# cat /var/log/cassandra/system.log | grep 'GC '|grep "2021-06-12 10:16"
INFO [Service Thread] 2021-06-12 10:16:08,679 GCInspector.java:285 - G1 Young Generation GC in 424ms. G1 Eden Space: 14881390592 -> 0; G1 Old Gen: 12725911968 -> 12729713040; G1 Survivor Space: 914358272 -> 1660944384;
INFO [Service Thread] 2021-06-12 10:16:12,857 GCInspector.java:285 - G1 Young Generation GC in 592ms. G1 Eden Space: 9755951104 -> 0; G1 Old Gen: 12725518752 -> 14310792096; G1 Survivor Space: 1660944384 -> 520093696;
INFO [Service Thread] 2021-06-12 10:16:17,105 GCInspector.java:285 - G1 Young Generation GC in 443ms. G1 Eden Space: 6341787648 -> 0; G1 Old Gen: 14310792128 -> 14818476488; G1 Survivor Space: 520093696 -> 478150656;
INFO [Service Thread] 2021-06-12 10:16:23,910 GCInspector.java:285 - G1 Young Generation GC in 555ms. G1 Eden Space: 10745806848 -> 0; G1 Old Gen: 14814282200 -> 14826865104; G1 Survivor Space: 478150656 -> 1224736768;
INFO [Service Thread] 2021-06-12 10:16:25,076 GCInspector.java:285 - G1 Young Generation GC in 420ms. G1 Eden Space: 3372220416 -> 0; G1 Old Gen: 14822670816 -> 16018047464; G1 Survivor Space: 1224736768 -> 142606336;
INFO [Service Thread] 2021-06-12 10:16:30,072 GCInspector.java:285 - G1 Young Generation GC in 450ms. G1 Eden Space: 8128561152 -> 0; G1 Old Gen: 16018047496 -> 16026436104; G1 Survivor Space: 142606336 -> 662700032;
INFO [Service Thread] 2021-06-12 10:16:33,400 GCInspector.java:285 - G1 Young Generation GC in 334ms. G1 Eden Space: 4638900224 -> 0; G1 Old Gen: 16022241816 -> 16680747552; G1 Survivor Space: 662700032 -> 394264576;
INFO [Service Thread] 2021-06-12 10:16:37,680 GCInspector.java:285 - G1 Young Generation GC in 276ms. G1 Eden Space: 6870269952 -> 0; G1 Old Gen: 16676553264 -> 16684942376; G1 Survivor Space: 394264576 -> 855638016;
INFO [Service Thread] 2021-06-12 10:16:40,827 GCInspector.java:285 - G1 Young Generation GC in 352ms. G1 Eden Space: 5930745856 -> 0; G1 Old Gen: 16684942376 -> 17511219752; G1 Survivor Space: 855638016 -> 394264576;
INFO [Service Thread] 2021-06-12 10:16:45,750 GCInspector.java:285 - G1 Young Generation GC in 332ms. G1 Eden Space: 6811549696 -> 0; G1 Old Gen: 17511219784 -> 17515414600; G1 Survivor Space: 394264576 -> 838860800;
INFO [Service Thread] 2021-06-12 10:16:50,453 GCInspector.java:285 - G1 Young Generation GC in 404ms. G1 Eden Space: 7281311744 -> 0; G1 Old Gen: 17515414632 -> 17884512880; G1 Survivor Space: 838860800 -> 1006632960;
INFO [Service Thread] 2021-06-12 10:16:54,643 GCInspector.java:285 - G1 Young Generation GC in 874ms. G1 Eden Space: 6987710464 -> 0; G1 Old Gen: 17884512880 -> 18836619880; G1 Survivor Space: 1006632960 -> 419430400;
INFO [Service Thread] 2021-06-12 10:16:56,223 GCInspector.java:285 - G1 Young Generation GC in 320ms. G1 Eden Space: 3565158400 -> 0; G1 Old Gen: 18832425592 -> 19260244600; G1 Survivor Space: 419430400 -> 176160768;
INFO [Service Thread] 2021-06-12 10:17:04,411 GCInspector.java:285 - G1 Young Generation GC in 458ms. G1 Eden Space: 8665432064 -> 0; G1 Old Gen: 19260244632 -> 19272827536; G1 Survivor Space: 176160768 -> 914358272;
INFO [Service Thread] 2021-06-12 10:17:05,816 GCInspector.java:285 - G1 Young Generation GC in 275ms. G1 Eden Space: 3548381184 -> 0; G1 Old Gen: 19268633248 -> 20178797224; G1 Survivor Space: 914358272 -> 218103808;
INFO [Service Thread] 2021-06-12 10:17:09,467 GCInspector.java:285 - G1 Young Generation GC in 325ms. G1 Eden Space: 7230980096 -> 0; G1 Survivor Space: 218103808 -> 595591168;
INFO [Service Thread] 2021-06-12 10:17:14,061 GCInspector.java:285 - G1 Young Generation GC in 328ms. G1 Eden Space: 7038042112 -> 0; G1 Old Gen: 20178797256 -> 20367540944; G1 Survivor Space: 595591168 -> 746586112;
INFO [Service Thread] 2021-06-12 10:17:17,015 GCInspector.java:285 - G1 Young Generation GC in 276ms. G1 Eden Space: 7222591488 -> 0; G1 Old Gen: 20367540944 -> 20791165640; G1 Survivor Space: 746586112 -> 394264576;
INFO [Service Thread] 2021-06-12 10:17:20,148 GCInspector.java:285 - G1 Young Generation GC in 229ms. G1 Eden Space: 8044675072 -> 0; G1 Old Gen: 20791165640 -> 20791166144; G1 Survivor Space: 394264576 -> 494927872;
INFO [Service Thread] 2021-06-12 10:17:22,120 GCInspector.java:285 - G1 Young Generation GC in 209ms. G1 Eden Space: 7977566208 -> 0; G1 Survivor Space: 494927872 -> 578813952;
INFO [Service Thread] 2021-06-12 10:17:25,177 GCInspector.java:285 - G1 Young Generation GC in 236ms. G1 Eden Space: 7809794048 -> 0; G1 Old Gen: 20791166144 -> 21117620936; G1 Survivor Space: 578813952 -> 343932928;
INFO [Service Thread] 2021-06-12 10:17:36,034 GCInspector.java:285 - G1 Young Generation GC in 212ms. G1 Eden Space: 7507804160 -> 0; G1 Old Gen: 21126009552 -> 21203301584; G1 Survivor Space: 562036736 -> 553648128;
INFO [Service Thread] 2021-06-12 10:17:54,560 GCInspector.java:285 - G1 Young Generation GC in 212ms. G1 Eden Space: 6937378816 -> 0; G1 Old Gen: 21822890720 -> 21818696424; G1 Survivor Space: 494927872 -> 562036736;

统计分析debug.log

INFO [SASI-General:651] 2021-06-12 10:16:38,951 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_autocheck-87df2b90c37211ebbae3a7203a91ffe6/md-429-big-SI_i_cl_autocheck_id.db_26, took 3131 ms.
INFO [Service Thread] 2021-06-12 10:16:40,827 GCInspector.java:285 - G1 Young Generation GC in 352ms. G1 Eden Space: 5930745856 -> 0; G1 Old Gen: 16684942376 -> 17511219752; G1 Survivor Space: 855638016 -> 394264576;
INFO [SASI-General:652] 2021-06-12 10:16:42,088 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/c'l_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_8, took 3239 ms.
INFO [Service Thread] 2021-06-12 10:16:45,750 GCInspector.java:285 - G1 Young Generation GC in 332ms. G1 Eden Space: 6811549696 -> 0; G1 Old Gen: 17511219784 -> 17515414600; G1 Survivor Space: 394264576 -> 838860800;
INFO [SASI-General:653] 2021-06-12 10:16:46,648 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_9, took 3208 ms.
INFO [Service Thread] 2021-06-12 10:16:50,453 GCInspector.java:285 - G1 Young Generation GC in 404ms. G1 Eden Space: 7281311744 -> 0; G1 Old Gen: 17515414632 -> 17884512880; G1 Survivor Space: 838860800 -> 1006632960;
INFO [SASI-General:654] 2021-06-12 10:16:51,005 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_10, took 3338 ms.
INFO [SASI-General:648] 2021-06-12 10:16:53,267 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_autocheck-87df2b90c37211ebbae3a7203a91ffe6/md-429-big-SI_i_cl_autocheck_id.db_27, took 3350 ms.
INFO [Service Thread] 2021-06-12 10:16:54,643 GCInspector.java:285 - G1 Young Generation GC in 874ms. G1 Eden Space: 6987710464 -> 0; G1 Old Gen: 17884512880 -> 18836619880; G1 Survivor Space: 1006632960 -> 419430400;
INFO [Service Thread] 2021-06-12 10:16:56,223 GCInspector.java:285 - G1 Young Generation GC in 320ms. G1 Eden Space: 3565158400 -> 0; G1 Old Gen: 18832425592 -> 19260244600; G1 Survivor Space: 419430400 -> 176160768;
NFO [SASI-General:647] 2021-06-12 10:16:56,316 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_11, took 4122 ms.
INFO [SASI-General:649] 2021-06-12 10:17:00,543 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_12, took 2896 ms.
INFO [Service Thread] 2021-06-12 10:17:04,411 GCInspector.java:285 - G1 Young Generation GC in 458ms. G1 Eden Space: 8665432064 -> 0; G1 Old Gen: 19260244632 -> 19272827536; G1 Survivor Space: 176160768 -> 914358272;
INFO [Service Thread] 2021-06-12 10:17:05,816 GCInspector.java:285 - G1 Young Generation GC in 275ms. G1 Eden Space: 3548381184 -> 0; G1 Old Gen: 19268633248 -> 20178797224; G1 Survivor Space: 914358272 -> 218103808;
INFO [SASI-General:650] 2021-06-12 10:17:06,040 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_13, took 3807 ms.
INFO [SASI-General:651] 2021-06-12 10:17:08,017 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_autocheck-87df2b90c37211ebbae3a7203a91ffe6/md-429-big-SI_i_cl_autocheck_id.db_28, took 3169 ms.
INFO [Service Thread] 2021-06-12 10:17:09,467 GCInspector.java:285 - G1 Young Generation GC in 325ms. G1 Eden Space: 7230980096 -> 0; G1 Survivor Space: 218103808 -> 595591168;
INFO [SASI-General:652] 2021-06-12 10:17:10,357 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_14, took 3229 ms.
INFO [CompactionExecutor:2337] 2021-06-12 10:17:11,779 PerSSTableIndexWriter.java:275 - Scheduling index flush to /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db
INFO [SASI-General:654] 2021-06-12 10:17:12,158 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_16, took 379 ms.
INFO [Service Thread] 2021-06-12 10:17:14,061 GCInspector.java:285 - G1 Young Generation GC in 328ms. G1 Eden Space: 7038042112 -> 0; G1 Old Gen: 20178797256 -> 20367540944; G1 Survivor Space: 595591168 -> 746586112;
INFO [SASI-General:653] 2021-06-12 10:17:14,455 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_liveinfo-c06c6770c37211ebbae3a7203a91ffe6/md-28-big-SI_i_cl_liveinfo_id.db_15, took 3247 ms.
INFO [Service Thread] 2021-06-12 10:17:17,015 GCInspector.java:285 - G1 Young Generation GC in 276ms. G1 Eden Space: 7222591488 -> 0; G1 Old Gen: 20367540944 -> 20791165640; G1 Survivor Space: 746586112 -> 394264576;
INFO [Service Thread] 2021-06-12 10:17:20,148 GCInspector.java:285 - G1 Young Generation GC in 229ms. G1 Eden Space: 8044675072 -> 0; G1 Old Gen: 20791165640 -> 20791166144; G1 Survivor Space: 394264576 -> 494927872;
INFO [Service Thread] 2021-06-12 10:17:22,120 GCInspector.java:285 - G1 Young Generation GC in 209ms. G1 Eden Space: 7977566208 -> 0; G1 Survivor Space: 494927872 -> 578813952;
INFO [Service Thread] 2021-06-12 10:17:25,177 GCInspector.java:285 - G1 Young Generation GC in 236ms. G1 Eden Space: 7809794048 -> 0; G1 Old Gen: 20791166144 -> 21117620936; G1 Survivor Space: 578813952 -> 343932928;
INFO [Service Thread] 2021-06-12 10:17:36,034 GCInspector.java:285 - G1 Young Generation GC in 212ms. G1 Eden Space: 7507804160 -> 0; G1 Old Gen: 21126009552 -> 21203301584; G1 Survivor Space: 562036736 -> 553648128;
INFO [SASI-General:647] 2021-06-12 10:17:36,210 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_autocheck-87df2b90c37211ebbae3a7203a91ffe6/md-429-big-SI_i_cl_autocheck_id.db_30, took 3187 ms.
INFO [SASI-General:649] 2021-06-12 10:17:50,517 PerSSTableIndexWriter.java:268 - Flushed index segment /data1/data/services/cassandra/data/class/cl_autocheck-87df2b90c37211ebbae3a7203a91ffe6/md-429-big-SI_i_cl_autocheck_id.db_31, took 2770 ms.
INFO [Service Thread] 2021-06-12 10:17:54,560 GCInspector.java:285 - G1 Young Generation GC in 212ms. G1 Eden Space: 6937378816 -> 0; G1 Old Gen: 21822890720 -> 21818696424; G1 Survivor Space: 494927872 -> 562036736;

从debug日志里可以清楚看到,每次PerSSTableIndexWriter.java:268 - Flushed index segment 都伴随着GC,而GC会造成服务停顿,一般情况下GC不会超过200ms,但是日志里的信息,有的会将近900ms。因此怀疑是GC造成的写入数据失败。

为什么会在id上创建索引:在每个表中为id创建一个索引的是为了在mysql迁移至cassandra之后,回滚查询设置的,在业务中并未使用。

处理方法:删除无用的id索引。

提问:麻烦大神解答一下疑惑,为什么索引会造成大量的GC, 原理是什么?

migration
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

1 Answer

Erick Ramirez avatar image
Erick Ramirez answered ·

I think the answer is a lot simpler than what you think -- the job/app/task which is loading the data to Cassandra is overloading the nodes leading to the long and frequent GC pauses which is then causing the successive writes to timeout.

The indexing does amplify the writes since an insert requires multiple writes -- one write to save the mutation plus another write to save the index entry.

My suggestion is to either (a) throttle the data load, or (b) increase the capacity of your cluster by adding more nodes. Cheers!

1 comment Share
10 |1000 characters needed characters left characters exceeded

Up to 8 attachments (including images) can be used with a maximum of 1.0 MiB each and 10.0 MiB total.

thank you!

0 Likes 0 ·