
lavaraja.padala_150810 asked

Cassandra node is failing with error "Too many open files"

We have a Cassandra cluster of 8 nodes (Apache Cassandra 3.11.11). Due to a disk failure on 2 nodes, we removed those nodes from the cluster. While trying to add them back to the cluster, the bootstrap process is failing with the error below.

ERROR [STREAM-IN-/172.29.62.28:7000] 2021-10-10 23:08:03,437 DefaultFSErrorHandler.java:94 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
org.apache.cassandra.io.FSWriteError: java.io.FileNotFoundException: /cassandra/data/keyspace1/data_tbl-b4f243f986c711e8a0bc25f553f28aa6/me-62630-big-Filter.db (Too many open files)
 at org.apache.cassandra.io.sstable.format.big.BigTableWriter$IndexWriter.flushBf(BigTableWriter.java:496) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.io.sstable.format.big.BigTableWriter$IndexWriter.doPrepare(BigTableWriter.java:516) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:364) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:168) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.finish(Transactional.java:179) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.io.sstable.format.SSTableWriter.finish(SSTableWriter.java:264) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.finish(SimpleSSTableMultiWriter.java:59) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.io.sstable.format.RangeAwareSSTableWriter.finish(RangeAwareSSTableWriter.java:130) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.streaming.StreamReceiveTask.received(StreamReceiveTask.java:113) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:672) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:539) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:317) ~[apache-cassandra-3.11.11.jar:3.11.11]
 at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_102]
Caused by: java.io.FileNotFoundException: /cassandra/data/keyspace1/data_tbl-b4f243f986c711e8a0bc25f553f28aa6/me-62630-big-Filter.db (Too many open files)
 at java.io.FileOutputStream.open0(Native Method) ~[na:1.8.0_102]

This issue is caused by the table keyspace1/data_tbl, which has too many SSTables in its directory. Quite a few of the SSTables are smaller than 1 MB.

[cassandra@host data_tbl-b4f243f986c711e8a0bc25f553f28aa6]$ ls -lcrt | wc -l
371455
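
Note that each SSTable consists of several component files (Data.db, Index.db, Filter.db, Summary.db, etc.), so the raw file count is several times the actual number of SSTables. Counting just the Data.db components gives the real SSTable count; for example:

# count only the Data.db components to get the true SSTable count
[cassandra@host data_tbl-b4f243f986c711e8a0bc25f553f28aa6]$ find . -maxdepth 1 -name '*-Data.db' | wc -l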

As a temporary fix we increased the open file limit, but that didn't resolve the issue and the node again failed during the bootstrap process with the above error. Is there a solution for this?
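
For reference, this is roughly how we verify the limit actually in effect for the running Cassandra process (a sketch that assumes a single Cassandra JVM on the host, located via its CassandraDaemon main class):

# limit in effect for the running process (may differ from the shell's ulimit)
[cassandra@host ~]$ grep 'open files' /proc/$(pgrep -f CassandraDaemon)/limits
# file descriptors currently held open by the process
[cassandra@host ~]$ ls /proc/$(pgrep -f CassandraDaemon)/fd | wc -l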

cassandra

1 Answer

Erick Ramirez answered

In my experience, when nodes have thousands of tiny files it's usually caused by the cluster getting overloaded. When a node receives a high volume of writes, the JVM heap comes under pressure, so memtables are constantly flushed to disk to free up memory.

The constant flushing writes only small amounts of data to each SSTable. At some point, compaction will catch up and coalesce the small files into larger SSTables.
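
If you want to watch that happen, nodetool reports pending compactions, and you can temporarily lift the compaction throughput cap so it catches up faster (a value of 0 disables throttling; the change lasts only until the node restarts):

# show active and pending compactions
$ nodetool compactionstats
# temporarily remove the compaction throughput cap (0 = unthrottled)
$ nodetool setcompactionthroughput 0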

Until compaction catches up, you need to increase the number of open file descriptors on the operating system. Our general recommendation is to set it to one million. You may need to keep raising it temporarily until you can get through the bootstrap process. Cheers!
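
As a rough sketch of how to raise the limit for the cassandra user (the exact file and service name depend on how Cassandra was installed):

# non-systemd installs: add to /etc/security/limits.d/cassandra.conf
cassandra - nofile 1048576

# systemd-managed services ignore limits.conf; override the unit instead
# (sudo systemctl edit cassandra) and add:
[Service]
LimitNOFILE=1048576

# then reload and restart so the new limit applies to the process:
$ sudo systemctl daemon-reload
$ sudo systemctl restart cassandra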

1 comment

Thank you. We will increase the limit and try.
