reshab200 asked reshab200 commented

Why does performance degrade when multiple threads run in parallel?

I have a function that queries multiple partitions asynchronously and combines the results. It works well when a single thread executes it, but performance degrades for all threads when several threads execute it in parallel. I cannot work out why this is happening. My function is below.


@Override
public List<ConversationDetail> getConversationDetailByCreateDateAndCmId(Timestamp startDateTimestamp, Timestamp endDateTimestamp, int cmId) {
    LocalDate startDate = startDateTimestamp.toInstant().atZone(ZoneId.systemDefault()).toLocalDate();
    LocalDate endDate = endDateTimestamp.toInstant().atZone(ZoneId.systemDefault()).toLocalDate();

    List<ConversationDetail> results = new ArrayList<>();
    List<ResultSetFuture> futures = new ArrayList<>();

    try {
        // One async query per full day strictly between the start and end days.
        for (LocalDate date = startDate.plusDays(1); date.isBefore(endDate); date = date.plusDays(1)) {
            ResultSetFuture resultSetFuture = session.executeAsync(findByCreateDayAndCmId.bind(date, cmId).setReadTimeoutMillis(60000));
            futures.add(resultSetFuture);
        }
        // The first and last days are partial, so they are bounded by the timestamps.
        futures.add(session.executeAsync(findByCreateDayAndCmIdAndTimestampGreaterThanEqual.bind(startDate, cmId, startDateTimestamp).setReadTimeoutMillis(60000)));
        futures.add(session.executeAsync(findByCreateDayAndCmIdAndTimestampLessThanEqual.bind(endDate, cmId, endDateTimestamp).setReadTimeoutMillis(60000)));
        // Wait for each query in turn and collect the mapped rows.
        for (ResultSetFuture future : futures) {
            try {
                ResultSet rows = future.get();
                Iterator<ConversationDetail> it = mapper.map(rows).iterator();
                while (it.hasNext()) {
                    results.add(it.next());
                }
            } catch (Exception e) {
                System.out.println("Exception1 " + e.getMessage());
            }
        }
    } catch (Exception e) {
        System.out.println("Exception2 " + e.getMessage());
    }
    return results;
}


java driver

1 Answer

Erick Ramirez answered reshab200 commented

With the limited information you provided, my guess is that your queries are overloading the cluster.

You need to review the underlying queries you are running, particularly if you are not retrieving data by partition key. Using ALLOW FILTERING without restricting the query to a single partition is very expensive and would explain why performance eventually tanks. Cheers!
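
If the fan-out of async requests is what overloads the cluster, one common mitigation is to cap the number of requests in flight across all caller threads. The following is only a minimal sketch of that idea, not the driver's API: `BoundedAsync`, `submit`, `simulatedQuery`, and `peakInFlight` are hypothetical names, and a plain `CompletableFuture` that sleeps for 10 ms stands in for `session.executeAsync(...)`. The driver's `ResultSetFuture` is not a `CompletableFuture`, so real code would need adapting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class BoundedAsync {
    // Shared by every caller thread, so the cap applies application-wide.
    private final Semaphore permits;

    public BoundedAsync(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Blocks the caller until a permit is free, then starts the async call.
    // The permit is returned when the call completes, successfully or not.
    public <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> asyncCall) {
        permits.acquireUninterruptibly();
        try {
            return asyncCall.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release(); // the call never started, so give the permit back
            throw e;
        }
    }

    // Hypothetical stand-in for an async query: sleeps briefly and records
    // how many simulated queries are running at the same time.
    private static CompletableFuture<Integer> simulatedQuery(int day, AtomicInteger inFlight, AtomicInteger peak) {
        return CompletableFuture.supplyAsync(() -> {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            inFlight.decrementAndGet();
            return day;
        });
    }

    // Issues `requests` simulated queries through the limiter and returns
    // the highest number observed running concurrently.
    public static int peakInFlight(int maxInFlight, int requests) {
        BoundedAsync limiter = new BoundedAsync(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (int day = 0; day < requests; day++) {
            final int d = day;
            futures.add(limiter.submit(() -> simulatedQuery(d, inFlight, peak)));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak in flight: " + peakInFlight(8, 30));
    }
}
```

The limit of 8 is an arbitrary illustration. A suitable value depends on the cluster and would have to be found by measurement.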

Thanks for the reply. The details are below.

These are the three queries being run:

findByCreateDayAndCmId - Select * from conversation_detail where create_day=? and cm_id=?;

findByCreateDayAndCmIdAndTimestampGreaterThanEqual - Select * from conversation_detail where create_day= ? and cm_id=? and create_date >= ?;

findByCreateDayAndCmIdAndTimestampLessThanEqual - Select * from conversation_detail where create_day=? and cm_id=? and create_date <= ?


The table structure for conversation_detail is shown below. The partition key is create_day, and cm_id, create_date, and id are clustering columns. My queries filter on these columns only.

So querying one month of data makes 30 async requests. It works fine when only one thread executes the function, but it slows down as the number of threads increases.

One month of data is approximately 374,000 rows, the table contains 11,500,000 rows in total, and each partition holds only about 15,000 rows.

PRIMARY KEY ((create_day),cm_id,create_date,id)
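
Because create_day is the partition key, the number of requests per call equals the number of calendar days in the range, and every concurrent caller multiplies that figure. Below is a rough sketch of the arithmetic. `DayBuckets` and `between` are hypothetical names, and the dates and the thread count of 10 are made-up illustration values.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class DayBuckets {
    // Every calendar day from start to end, inclusive.
    // Each day is one partition, and therefore one query.
    public static List<LocalDate> between(LocalDate start, LocalDate end) {
        List<LocalDate> days = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            days.add(d);
        }
        return days;
    }

    public static void main(String[] args) {
        List<LocalDate> days = between(LocalDate.of(2021, 6, 1), LocalDate.of(2021, 6, 30));
        int threads = 10; // hypothetical number of concurrent callers
        System.out.println(days.size() + " requests per call");
        System.out.println(days.size() * threads + " requests in flight with " + threads + " threads");
    }
}
```

With these example values, one call issues 30 requests, and 10 threads calling at once put up to 300 requests in flight.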
