How do I split a large SSTable file on a server that is not part of a running cluster?
When forcing a major compaction on a table configured with the
SizeTieredCompactionStrategy, all the SSTables on the node are compacted together into a single large SSTable. Due to its size, the resulting SSTable will likely never be compacted again, since similar-sized SSTables are not available as compaction candidates. This causes additional problems on the node, since tombstones are never evicted and keep accumulating, degrading the cluster's performance.
The large SSTables need to be split into multiple smaller SSTables so they can be compacted as normal, using the
sstablesplit tool. However, sstablesplit is an offline tool which requires Cassandra to be shut down on the node. The steps in this article provide a workaround which does not require downtime.
WARNING - Although it may be possible to run a compatible
sstablesplit from another Cassandra version, e.g. splitting C* 3.0 SSTables with C* 3.11, this is not a tested configuration and is not recommended.
Follow these steps to split a large SSTable on another server that is not part of a cluster.
Step 1 - Copy the large SSTable and all its component files from the source node to the alternate server. For example, if splitting SSTable generation 5678 from a C* 3.11 cluster, copy the whole set:
md-5678-big-CompressionInfo.db
md-5678-big-CRC.db
md-5678-big-Data.db
md-5678-big-Digest.crc32
md-5678-big-Filter.db
md-5678-big-Index.db
md-5678-big-Statistics.db
md-5678-big-Summary.db
md-5678-big-TOC.txt
WARNING - Only copy SSTables from one source node at a time. DO NOT mix SSTables from multiple source nodes.
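Since every component of a generation shares the md-<generation>- prefix, the copy in Step 1 can be done with a single glob. The sketch below demonstrates this with a local copy on throwaway files; all paths and the generation number are examples only, and on a real system the cp would be replaced with scp or rsync to the alternate server.

```shell
# Demonstration with a local copy; in practice, replace `cp` with
# scp/rsync to the alternate server. Paths and generation are examples.
mkdir -p /tmp/sstable-src /tmp/sstable-work
cd /tmp/sstable-src
# Stand-ins for the nine component files of generation 5678
touch md-5678-big-{CompressionInfo,CRC,Data,Filter,Index,Statistics,Summary}.db \
      md-5678-big-Digest.crc32 md-5678-big-TOC.txt
# The md-5678-* glob picks up every component of that generation
cp md-5678-* /tmp/sstable-work/
ls /tmp/sstable-work | wc -l    # 9 files copied
```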
Step 2 - Here is a recommended way of running the tool:
$ tools/bin/sstablesplit --debug --no-snapshot -v /path/to/large/sstable/*
The -v flag reports additional troubleshooting information back to the console. The
--no-snapshot flag skips the need for a snapshot since the tool is operating on a secondary copy of the SSTable.
By default, the tool generates multiple 50MB SSTables. Alternatively, it is possible to specify a target size using the
-s flag, e.g.
-s 100 to generate multiple 100MB SSTables.
Step 3 - Copy all the new files (including all component files, e.g.
*-Statistics.db) back to the source node.
WARNING - Only copy the new files to the owner of the original large SSTable. Each node owns a portion of the data and copying files onto a node which does not own the data will result in data loss.
Step 4 - Check file permissions on the newly copied files to make sure they match the rest of the SSTables on the node.
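One way to make the permissions line up is to use an existing SSTable on the node as a reference. The sketch below uses throwaway files and assumed modes (many packaged installs use cassandra:cassandra ownership and mode 644, but check your own node); on the real node the reference file would be an existing SSTable in the table's data directory, and chown would need root.

```shell
# Demonstration on throwaway files; directory, filenames and modes are
# assumptions for illustration.
mkdir -p /tmp/perm-demo && cd /tmp/perm-demo
touch existing-Data.db new-Data.db
chmod 644 existing-Data.db                       # mode of the node's SSTables
chmod 600 new-Data.db                            # wrong mode, as copied in
chmod --reference=existing-Data.db new-Data.db   # GNU coreutils: copy the mode
stat -c '%a' new-Data.db                         # 644
```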
Step 5 - On the source node, run
nodetool drain, then temporarily stop Cassandra.
Step 6 - Move the original large SSTable (and all its component files) out of the data directory.
Step 7 - Start Cassandra.
After starting Cassandra, check the
debug.log to confirm that the new SSTables were opened and read.
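A quick way to check is to grep debug.log for the new generation numbers. The exact log wording varies by Cassandra version, but SSTable opens are typically recorded in "Opening ..." lines; the simulated log line, path, and generation below are hypothetical.

```shell
# Simulated log for demonstration; on a real node, grep the actual
# /var/log/cassandra/debug.log instead.
printf 'DEBUG [main] SSTableReader.java - Opening md-5701-big (50MB)\n' > /tmp/debug.log
grep -c 'Opening md-' /tmp/debug.log    # 1 matching line
```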
Also run nodetool cfstats against the table and check statistics such as data size and/or estimated keys.
In circumstances where an SSTable is excessively large or contains large partitions, the
sstablesplit utility can fail with an
OutOfMemoryError. In this situation, increase the JVM heap size used by the tool. For example, to increase the heap to 8GB, modify the default heap setting in the
tools/bin/sstablesplit shell script.
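As an assumption based on the stock tool script (the exact wording varies by Cassandra version), the relevant fragment looks like the following, with the default heap raised to 8G:

```shell
# Hypothetical excerpt from tools/bin/sstablesplit; check your own version.
if [ "x$MAX_HEAP_SIZE" = "x" ]; then
    MAX_HEAP_SIZE="8G"    # was 256M by default
fi
```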