Recently one of our deployment sees frequent restart of cassandra. cassandra is run in a RHEL container.
From system.log/debug.log, all i notice is that goes to shutdown but could not figure out reason behind it.
Appreciate any help to debug further and understand the actual cause for the shutdown. Also would be great if you could suggest any good reads for maintaining cassandra clusters and best practices.
Cassandra Details:
Version: 3.11.3
Source Path: https://archive.apache.org/dist/cassandra/3.11.3/apache-cassandra-3.11.3-bin.tar.gz
HA mode: 3 node cluster
Shutdown log messages: (ip addresses are changed for privacy reasons)
WARN [ReadStage-1] 2020-11-22 20:56:23,475 ReadCommand.java:569 - Read 551 live rows and 1179 tombstone cells for query SELECT value FROM ... INFO [StorageServiceShutdownHook] 2020-11-22 21:00:29,841 HintsService.java:220 - Paused hints dispatch INFO [StorageServiceShutdownHook] 2020-11-22 21:00:29,842 ThriftServer.java:139 - Stop listening to thrift clients INFO [StorageServiceShutdownHook] 2020-11-22 21:00:29,850 Server.java:176 - Stop listening for CQL clients INFO [StorageServiceShutdownHook] 2020-11-22 21:00:29,851 Gossiper.java:1559 - Announcing shutdown INFO [StorageServiceShutdownHook] 2020-11-22 21:00:29,851 StorageService.java:2289 - Node /1.2.3.4 state jump to shutdown INFO [StorageServiceShutdownHook] 2020-11-22 21:00:31,853 MessagingService.java:992 - Waiting for messaging service to quiesce
logs during startup after shutdown
INFO [main] 2020-11-22 21:03:53,368 CassandraDaemon.java:473 - Hostname: hostname.domain.com INFO [main] 2020-11-22 21:03:53,368 CassandraDaemon.java:480 - JVM vendor/version: OpenJDK 64-Bit Server VM/1.8.0_252 INFO [main] 2020-11-22 21:03:53,369 CassandraDaemon.java:481 - Heap size: 7.922GiB/7.922GiB INFO [main] 2020-11-22 21:03:53,369 CassandraDaemon.java:486 - Code Cache Non-heap memory: init = 2555904(2496K) used = 4287296(4186K) committed = 4325376(4224K) max = 251658240(245760K) INFO [main] 2020-11-22 21:03:53,369 CassandraDaemon.java:486 - Metaspace Non-heap memory: init = 0(0K) used = 16999920(16601K) committed = 17432576(17024K) max = -1(-1K) INFO [main] 2020-11-22 21:03:53,369 CassandraDaemon.java:486 - Compressed Class Space Non-heap memory: init = 0(0K) used = 2050984(2002K) committed = 2228224(2176K) max = 1073741824(1048576K) INFO [main] 2020-11-22 21:03:53,370 CassandraDaemon.java:486 - Par Eden Space Heap memory: init = 671088640(655360K) used = 147641472(144181K) committed = 671088640(655360K) max = 671088640(655360K) INFO [main] 2020-11-22 21:03:53,370 CassandraDaemon.java:486 - Par Survivor Space Heap memory: init = 83886080(81920K) used = 0(0K) committed = 83886080(81920K) max = 83886080(81920K) INFO [main] 2020-11-22 21:03:53,370 CassandraDaemon.java:486 - CMS Old Gen Heap memory: init = 7751073792(7569408K) used = 0(0K) committed = 7751073792(7569408K) max = 7751073792(7569408K) INFO [main] 2020-11-22 21:03:53,370 CassandraDaemon.java:488 - Classpath: /etc/cassandra:/usr/share/cassandra/lib/HdrHistogram-2.1.9.jar:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/lib/antlr-runtime-3.5.2.jar:/usr/share/cassandra/lib/asm-5.0.4.jar:/usr/share/cassandra/lib/caffeine-2.2.6.jar:/usr/share/cassandra/lib/cassandra-driver-core-3.0.1-shaded.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.9.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/commons-math3-3.2.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrent-trees-2.4.0.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.4.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/ecj-4.4.2.jar:/usr/share/cassandra/lib/guava-18.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.0.6.jar:/usr/share/cassandra/lib/hppc-0.5.4.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.13.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.13.jar:/usr/share/cassandra/lib/jamm-0.3.0.jar:/usr/share/cassandra/lib/javax.inject.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jcl-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/jctools-core-1.2.1.jar:/usr/share/cassandra/lib/jflex-1.6.0.jar:/usr/share/cassandra/lib/jna-4.2.2.jar:/usr/share/cassandra/lib/joda-time-2.4.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/jstackjunit-0.0.1.jar:/usr/share/cassandra/lib/libthrift-0.9.2.jar:/usr/share/cassandra/lib/log4j-over-slf4j-1.7.7.jar:/usr/share/cassandra/lib/logback-classic-1.1.3.jar:/usr/share/cassandra/lib/logback-core-1.1.3.jar:/usr/share/cassandra/lib/lz4-1.3.0.jar:/usr/share/cassandra/lib/metrics-core-3.1.5.jar:/usr/share/cassandra/lib/metrics-jvm-3.1.5.jar:/usr/share/cassandra/lib/metrics-logback-3.1.5.jar:/usr/share/cassandra/lib/netty-all-4.0.44.Final.jar:/usr/share/cassandra/lib/ohc-core-0.4.4.jar:/usr/share/cassandra/lib/ohc-core-j8-0.4.4.jar:/usr/share/cassandra/lib/reporter-config-base-3.0.3.jar:/usr/share/cassandra/lib/reporter-config3-3.0.3.jar:/usr/share/cassandra/lib/sigar-1.6.4.jar:/usr/share/cassandra/lib/slf4j-api-1.7.7.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.1.1.7.jar:/usr/share/cassandra/lib/snowball-stemmer-1.3.0.581.1.jar:/usr/share/cassandra/lib/stream-2.5.2.jar:/usr/share/cassandra/lib/thrift-server-0.3.7.jar:/usr/share/cassandra/apache-cassandra-3.11.3.jar:/usr/share/cassandra/apache-cassandra-thrift-3.11.3.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/stress.jar::/usr/share/cassandra/lib/jamm-0.3.0.jar INFO [main] 2020-11-22 21:03:53,371 CassandraDaemon.java:490 - JVM Arguments: [-Xloggc:/var/log/cassandra/gc.log, -ea, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003, -XX:+AlwaysPreTouch, -XX:-UseBiasedLocking, -XX:+UseTLAB, -XX:+ResizeTLAB, -XX:+UseNUMA, -XX:+PerfDisableSharedMem, -Djava.net.preferIPv4Stack=true, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled, -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled, -XX:+CMSEdenChunksRecordAlways, -XX:+CMSClassUnloadingEnabled, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -XX:+PrintPromotionFailure, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, -Xms8192M, -Xmx8192M, -Xmn800M, -XX:+UseCondCardMark, -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler, -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar, -Dcassandra.jmx.local.port=7199, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password, -Djava.library.path=/usr/share/cassandra/lib/sigar-bin, -Dcassandra.rpc_port=9161, -Dcassandra.native_transport_port=9041, -Dcassandra.ssl_storage_port=7013, -Dcassandra.storage_port=7012, -Dcassandra.jmx.local.port=7201, -Dcassandra.libjemalloc=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1, -XX:OnOutOfMemoryError=kill -9 %p, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=/var/log/cassandra, -Dcassandra.storagedir=/var/lib/cassandra, -Dcassandra-foreground=yes] WARN [main] 2020-11-22 21:03:53,414 NativeLibrary.java:187 - Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root. INFO [main] 2020-11-22 21:03:53,415 StartupChecks.java:140 - jemalloc seems to be preloaded from /usr/lib/x86_64-linux-gnu/libjemalloc.so.1 WARN [main] 2020-11-22 21:03:53,415 StartupChecks.java:169 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info. INFO [main] 2020-11-22 21:03:53,418 SigarLibrary.java:44 - Initializing SIGAR library INFO [main] 2020-11-22 21:03:53,430 SigarLibrary.java:180 - Checked OS settings and found them configured for optimal performance. WARN [main] 2020-11-22 21:03:53,438 StartupChecks.java:311 - Maximum number of memory map areas per process (vm.max_map_count) 65530 is too low, recommended value: 1048575, you can change it with sysctl. INFO [main] 2020-11-22 21:03:53,596 QueryProcessor.java:116 - Initialized prepared statement caches with 31 MB (native) and 31 MB (Thrift)
I was suspecting it could be due to https://community.datastax.com/questions/6947/index.html, not sure, i may be wrong.
Let me know if you any need any further details.