yariv.amar_82168 asked ·

Are the Summary.db component files required to load data with sstableloader?

Hi,
I'm using sstableloader to load data into C*. The loader fails during the upload with this error:

java.lang.RuntimeException: Failed to list files in /mnt/migration/...mySourceFolder
:
Caused by: java.lang.AssertionError
        at org.apache.cassandra.io.sstable.IndexSummary.<init>(IndexSummary.java:86)
        at org.apache.cassandra.io.sstable.IndexSummary$IndexSummarySerializer.deserialize(IndexSummary.java:350)
        at org.apache.cassandra.io.sstable.format.SSTableReader.loadSummary(SSTableReader.java:905)


After some reading, I decided to remove mc-2-big-Summary.db from the source folder, and the loader then completed successfully.

Questions:

1. What is the role of Summary.db during an sstableloader run?

2. Is it safe to remove Summary.db from the source folder?


Thanks!

sstableloader

Erick Ramirez answered ·

@yariv.amar_82168 Yuki already explained that the Summary.db component of the SSTable set is not required to load SSTables with sstableloader. However, it's not a good idea to remove component files.

Our recommendation is to always keep all the SSTable component files together with the Data.db component as a set. There really is no good reason to exclude components from the set. Cheers!


I agree with the recommendation, and I've done this export/import many times before. I don't know why the `Summary.db` caused sstableloader to fail this time. That failure is the only reason I excluded it.

I assume that sstableloader streams the data from the source folder into the C* cluster rather than just copying files.

I'd be happy to learn whether there is any information I can look for to understand the root cause.

Thank you for the help.


Right. I see what you mean now. Cheers!

yukim answered ·

Hi,

> 1. what is the role of summary.db during sstable loader?

Summary.db contains a sample of the entries in the SSTable's index file (Index.db). Cassandra uses this summary to speed up key lookups in the index file.

https://docs.datastax.com/en/ddac/doc/datastax_enterprise/dbInternals/dbIntHowDataWritten.html#dbIntHowDataWritten__sstsummary
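
To make the sampling idea concrete, here is a toy sketch in Python. It is not Cassandra's actual on-disk format or code: the function names `build_summary` and `lookup` are made up for illustration, the index is modelled as a plain sorted list of keys, and positions are list offsets rather than byte offsets into Index.db. The interval of 128 is only an assumed sampling rate. The point is that the summary narrows a lookup to one small slice of the index, and that it can be rebuilt from the index alone.

```python
from bisect import bisect_right


def build_summary(index_keys, sampling_interval=128):
    """Keep every Nth (key, position) pair from a sorted index (toy model)."""
    return [
        (key, pos)
        for pos, key in enumerate(index_keys)
        if pos % sampling_interval == 0
    ]


def lookup(index_keys, summary, key):
    """Use the summary to pick one slice of the index, then scan only that slice."""
    sampled_keys = [k for k, _ in summary]
    slot = bisect_right(sampled_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first index entry
    start = summary[slot][1]
    end = summary[slot + 1][1] if slot + 1 < len(summary) else len(index_keys)
    for pos in range(start, end):
        if index_keys[pos] == key:
            return pos
    return None


# Hypothetical data: 1000 sorted keys standing in for Index.db entries.
index_keys = [f"key{i:05d}" for i in range(1000)]
summary = build_summary(index_keys)
position = lookup(index_keys, summary, "key00300")
```

Because `build_summary` only reads the index, running it again yields the same summary, which is the toy equivalent of the summary being recreatable from Index.db.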

> 2. is it safe to remove summary.db from the source folder?

The Summary.db file is not essential for reading SSTable data. Again, it is there to speed up reads. Cassandra can recreate the file from Index.db.

You can safely delete it if it is causing trouble.
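
If you would rather not delete anything outright, one option is to move the Summary.db files aside before running the loader, so they can be restored or inspected later. The Python sketch below is only an illustration: `set_aside_summaries` is a hypothetical helper, both directory arguments are placeholders you would supply yourself, and it assumes the component files follow the `...-Summary.db` naming seen in the question.

```python
from pathlib import Path
import shutil


def set_aside_summaries(source_dir, backup_dir):
    """Move every *-Summary.db file out of source_dir into backup_dir.

    backup_dir should live outside the folder passed to sstableloader.
    Returns the new paths of the moved files.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for summary in sorted(source.glob("*-Summary.db")):
        target = backup / summary.name
        shutil.move(str(summary), str(target))
        moved.append(target)
    return moved
```

All other components (Data.db, Index.db and so on) stay in place, so the rest of the SSTable set remains together.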
