We are migrating our data from Old Cassandra (1.2) to latest 3.11.6. I have written migration script which will read the data from old Cassandra and will fire queries for the new one (so it will be new insert).
After migration, when I am trying to read data, I am finding huge latency from one of the table in new Cassandra (2.5 seconds) compared with old one (600ms).
The total number of columns in this table is almost 60k. There is no clustering column and a single primary key which is unique for each row.
The requirement is to fetch 40-50 columns for 150 unique flow_id. I am using prepared statement which is fired asynchronously from my application and onceall CompletionFuture is done I am parsing the final result. The fetching data itself taking more than 2.5 seconds. Code to fetch result for unique flow_id
allRows = keyList.stream().map((key) -> {
return session.executeAsync(boundStmt.setToken(partitionKeyName, key)
.setExecutionProfile(profile).setTracing(isTracingEnabled)).toCompletableFuture();
}).collect(Collectors.toList()).stream().map(completableFuture -> {
AsyncResultSet asyncResultSet;
try {
//traceIfEnabled(asyncResultSet.getExecutionInfo(), isTracingEnabled);
asyncResultSet = completableFuture.get();
} catch (InterruptedException | ExecutionException e) {
throw new RuntimeGestorException(CassandraErrors.FAILURE, e, "moveForward");
}
return asyncResultSet.one();
}).filter(row -> row != null).collect(Collectors.toList());
The type of query (prepared statement) it is going to use
SELECT src_flow_id,failures,author,authored_at,version_num,description,step_1_time,source,type
,title,steps,url, labels,tags,title_lowercase,ent,user,host,votes,published_time,runs,
views,flow_id FROM flows WHERE token(flow_id) = tokien(?)
where place holder(?) will take the token of the 'key'
Row cache is disabled.
Kindly let me know what i am missing here.
**Details -**
Old Partitoner OrderPreserve
New Partitioner MurMur3
Total number of nodes : 4
Replication factor : 2
Machine configuration :
12 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz processor
48GB of RAM
CF Stats
Read Count: 738
Read Latency: 8.633516260162601 ms
Write Count: 57
Write Latency: 0.1060701754385965 ms
Pending Flushes: 0
Table: flows
SSTable count: 4
Space used (live): 2310544421
Space used (total): 2310544421
Space used by snapshots (total): 0
Off heap memory used (total): 595544
SSTable Compression Ratio: 0.9655157119293131
Number of partitions (estimate): 56942
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 37
Local read latency: 10.126 ms
Local write count: 0
Local write latency: NaN ms
Pending flushes: 0
Percent repaired: 100.0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 234440
Bloom filter off heap memory used: 234408
Index summary off heap memory used: 70416
Compression metadata off heap memory used: 290720
Compacted partition minimum bytes: 51
Compacted partition maximum bytes: 43388628
Compacted partition mean bytes: 13867
Average live cells per slice (last five minutes): 1.0
Maximum live cells per slice (last five minutes): 1
Average tombstones per slice (last five minutes): 75.75956284153006
Maximum tombstones per slice (last five minutes): 372
Dropped Mutations: 0