lucio asked (edited by Erick Ramirez)

ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 2, 'consistency': 'ALL'}

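Our cluster's keyspace is configured with 'replication_factor': '2'.

When we query by partition key with consistency ALL or QUORUM, the query fails with an error saying that not enough replicas responded. We suspect the data is inconsistent. Is there a way to handle this, and how can we find out what went wrong? Thanks.

Query error: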

ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 2, 'consistency': 'ALL'}

Error from the debug log:

WARN  [ReadStage-6] 2020-09-30 16:04:32,726 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[ReadStage-6,5,main]: {}
java.lang.IllegalArgumentException: null
 at java.nio.Buffer.limit(Buffer.java:275) ~[na:1.8.0_212]
 at org.apache.cassandra.io.util.CompressedChunkReader$Standard.readChunk(CompressedChunkReader.java:131) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:158) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:39) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:2949) ~[caffeine-2.2.6.jar:na]
 at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$15(BoundedLocalCache.java:1807) ~[caffeine-2.2.6.jar:na]
 at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1892) ~[na:1.8.0_212]
 at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:1805) ~[caffeine-2.2.6.jar:na]
 at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1788) ~[caffeine-2.2.6.jar:na]
 at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97) ~[caffeine-2.2.6.jar:na]
 at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66) ~[caffeine-2.2.6.jar:na]
 at org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:236) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:214) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:207) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.io.util.FileHandle.createReader(FileHandle.java:150) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.io.sstable.format.SSTableReader.getFileDataInput(SSTableReader.java:1807) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.seekToPosition(AbstractSSTableIterator.java:359) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.columniterator.AbstractSSTableIterator$IndexState.setToBlock(AbstractSSTableIterator.java:488) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.columniterator.SSTableIterator$ForwardIndexedReader.setForSlice(SSTableIterator.java:266) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:123) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:49) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:72) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:65) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.initializeIterator(UnfilteredRowIteratorWithLowerBound.java:107) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.partitionLevelDeletion(LazilyInitializedUnfilteredRowIterator.java:81) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.partitionLevelDeletion(UnfilteredRowIteratorWithLowerBound.java:167) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:764) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:669) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:503) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:422) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:48) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_212]
 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.11.3.jar:3.11.3]
 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.3.jar:3.11.3]
 at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]


consistency level

1 Answer

wdeng answered

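This error means that the coordinator node expected to read data from two replica nodes, but neither returned data: the replicas timed out or failed while executing the read.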
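To read the error more precisely, the fields in its info section can be pulled apart. The following is a minimal Python sketch, assuming the message has the same shape as the one posted in the question. `parse_read_failure` and `summarize` are hypothetical helper names used for illustration and are not part of any driver.

```python
import ast
import re


def parse_read_failure(message: str) -> dict:
    """Extract the info={...} dictionary from a ReadFailure message string."""
    match = re.search(r"info=(\{.*\})", message)
    if match is None:
        raise ValueError("no info={...} section found in message")
    # The info section is printed as a Python dict literal,
    # so ast.literal_eval can read it safely.
    return ast.literal_eval(match.group(1))


def summarize(info: dict) -> str:
    """Describe how far the read fell short of its consistency level."""
    missing = info["required_responses"] - info["received_responses"]
    return (
        f"CL={info['consistency']}: needed {info['required_responses']} responses, "
        f"got {info['received_responses']}, {info['failures']} replica failure(s) "
        f"reported, {missing} response(s) short"
    )


# The error text from the question, split across lines for readability.
MESSAGE = (
    "ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] "
    'message="Operation failed - received 0 responses and 1 failures" '
    "info={'failures': 1, 'received_responses': 0, "
    "'required_responses': 2, 'consistency': 'ALL'}"
)

if __name__ == "__main__":
    print(summarize(parse_read_failure(MESSAGE)))
```

For the message in the question, the parsed fields show that two responses were required, none were received, and one replica reported a failure.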

There are a few questions I need you to answer first:

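1. Does the error occur only when reading one particular partition key, or does it happen randomly across many partitions? If it is the latter, check the load on the nodes; they may be short of resources and too busy. But if the problem occurs on every read, the exception you posted may be the root cause.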

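2. The java.lang.IllegalArgumentException in your error output looks like it may be a bug. Others have run into similar problems before without a real resolution, for example this JIRA: https://issues.apache.org/jira/browse/CASSANDRA-11200. I suggest trying to reproduce the problem (for example, by running write and read experiments with the same schema and test data on a smaller, non-production cluster). If you can find a way to reproduce it, you can open a new JIRA to report it to the community.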

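3. RF=2 is generally not recommended in production. With RF=2, reads and writes at CL=QUORUM are no different from CL=ALL (a quorum of two replicas is both of them), which reduces the availability of the database. This is not directly related to the problem you hit; it is only a reminder.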
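The point about RF=2 follows from the quorum formula, floor(RF / 2) + 1. A short Python sketch of the arithmetic, with hypothetical helper names chosen for illustration:

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for CL=QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (2, 3):
        print(
            f"RF={rf}: QUORUM needs {quorum(rf)} replicas, "
            f"tolerates {tolerated_failures(rf)} unavailable"
        )
```

With RF=2 the quorum is 2, so a single unavailable or failing replica fails the request, exactly as with CL=ALL. With RF=3 the quorum is still 2, which leaves room for one replica to be down.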
