Background
Directories (and by extension filesystems/volumes/disks) used by Apache Cassandra all serve different purposes:
data - mostly read-heavy unless tables are on LCS (LeveledCompactionStrategy) with a high-update workload
commitlog - mostly writes by the nature of commits
solr_data_dir - both reads and writes, purely for Solr (only applies to nodes running DSE with Search enabled)
These directories need to be on separate disks so they are not competing for the same IO bandwidth. However, this isn't necessary with NVMe SSDs because they are fast enough to be very difficult to saturate.
IO schedulers
The noop scheduler (no operation) uses a first-in-first-out (FIFO) algorithm and is good for volumes backed by multiple disks since the IO is spread across the disks.
The deadline scheduler splits requests into queues. Each request has a timestamp associated with it and the kernel uses it to calculate an "expiration" on the request, hence the name "deadline". Requests closest to the deadline are prioritised by the scheduler. By default, reads have a shorter expiration (500ms) than writes (5s), effectively prioritising reads over writes.
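To make the effect of the two expiry values concrete, here is a toy model in Python. It is only a sketch of the "earliest deadline first" idea described above, not the kernel's implementation (the real scheduler also batches requests by sector order). The Request class, its field names and the request labels are hypothetical; the 500ms and 5s values are the defaults quoted above.

```python
from dataclasses import dataclass

# Default expiry values quoted above, in milliseconds.
READ_EXPIRE_MS = 500
WRITE_EXPIRE_MS = 5000


@dataclass(frozen=True)
class Request:
    name: str        # hypothetical label for the request
    kind: str        # "read" or "write"
    arrival_ms: int  # time at which the request was queued

    @property
    def deadline_ms(self) -> int:
        # Deadline = arrival time + expiry for the request type.
        expire = READ_EXPIRE_MS if self.kind == "read" else WRITE_EXPIRE_MS
        return self.arrival_ms + expire


def dispatch_order(requests):
    """Return request names ordered by earliest deadline first."""
    return [r.name for r in sorted(requests, key=lambda r: r.deadline_ms)]


queue = [
    Request("compaction-write", "write", arrival_ms=0),
    Request("client-read", "read", arrival_ms=1000),
]
print(dispatch_order(queue))  # ['client-read', 'compaction-write']
```

Even though the write was queued a full second earlier, its deadline (5000ms) falls after the read's (1500ms), so the read is served first.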
Recommendation
The Cassandra data directory is best suited for the deadline scheduler since reads are given priority over compactions. The choice of scheduler isn't so relevant for the commitlog since it is an almost purely write-only workload.
When it comes to the solr_data_dir for DSE Search nodes, choose the noop scheduler when the volume is backed by multiple disks, otherwise use the deadline scheduler as the default.
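The per-directory advice above can be summarised as a small lookup. This is just an illustrative sketch: recommend_scheduler and its multi_disk parameter are hypothetical names, and it returns None for the commitlog because the text expresses no strong preference there.

```python
def recommend_scheduler(directory, multi_disk=False):
    """Map a directory role to the scheduler suggested in the text above.

    Returns None where the text expresses no strong preference.
    """
    if directory == "data":
        # Reads should win over compaction IO.
        return "deadline"
    if directory == "commitlog":
        # Almost purely write-only, so the choice matters little.
        return None
    if directory == "solr_data_dir":
        # noop only when the volume is backed by multiple disks.
        return "noop" if multi_disk else "deadline"
    raise ValueError(f"unknown directory role: {directory!r}")


for role in ("data", "commitlog", "solr_data_dir"):
    print(role, recommend_scheduler(role))
```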
As a final point, the choice of scheduler for servers with NVMe SSDs is irrelevant since they are extremely fast and very difficult to saturate. In most cases, it is advisable to not use an IO scheduler at all (set it to none) since the kernel would otherwise waste resources scheduling IO requests unnecessarily, again because the disks are very fast.
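To check which scheduler a device is currently using, Linux exposes it in /sys/block/<device>/queue/scheduler, with the active one shown in square brackets. The sketch below parses that line; the function names and the device name nvme0n1 are assumptions for illustration, and the sample string is made up. Note that on newer multi-queue kernels the names listed are typically none and mq-deadline rather than noop and deadline.

```python
import re
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available)."""
    # Available schedulers are whitespace separated; strip the brackets.
    available = line.replace("[", "").replace("]", "").split()
    # The active scheduler is the one wrapped in square brackets.
    match = re.search(r"\[([^\]]+)\]", line)
    active = match.group(1) if match else None
    return active, available


def active_scheduler(device):
    """Read the active scheduler for a block device such as 'nvme0n1'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text().strip())[0]


# Illustrative input only; real output depends on the kernel and device.
print(parse_scheduler_line("[none] mq-deadline kyber"))
# ('none', ['none', 'mq-deadline', 'kyber'])
```

On a real node, active_scheduler("nvme0n1") would return the bracketed entry for that device, assuming a device with that name exists.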