basantgurung asked Erick Ramirez edited

Do queries using non-indexed columns cause performance issues with the Spark connector?

When querying Cassandra directly in CQL, filtering on a non-indexed column without specifying the partition key is not recommended because of performance issues, although the query can still be run with the ALLOW FILTERING keywords.

On the other hand, when we fetch data from Cassandra into a Spark RDD or DataFrame, whether through Spark SQL, the Data Sources API, or the Spark-Cassandra Connector, can filtering on the same non-indexed column (without the partition key) cause performance issues?

spark-cassandra-connector

1 Answer

Erick Ramirez answered Erick Ramirez edited

Yes, the same performance issues exist whether you run the query directly in CQL or through the Spark connector.

The use of ALLOW FILTERING with the Spark connector only works well with indexed columns. Cheers!
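
To make the cost difference concrete, here is a minimal toy model in plain Python. It is only a sketch: `ToyTable`, its methods, and the `user_id`/`country` columns are hypothetical names made up for illustration, not part of Cassandra or the Spark connector. It assumes rows are grouped by partition key and simply counts how many rows each kind of query has to read.

```python
from collections import defaultdict


class ToyTable:
    """Hypothetical stand-in for a table whose rows are grouped by partition key."""

    def __init__(self):
        self.partitions = defaultdict(list)
        self.rows_read = 0  # rows touched by the most recent query

    def insert(self, partition_key, row):
        self.partitions[partition_key].append(row)

    def by_partition_key(self, partition_key):
        # Only the single matching partition is read.
        rows = self.partitions.get(partition_key, [])
        self.rows_read = len(rows)
        return list(rows)

    def by_non_key_column(self, column, value):
        # No partition key and no index: every row is read, then filtered.
        self.rows_read = 0
        matches = []
        for rows in self.partitions.values():
            for row in rows:
                self.rows_read += 1
                if row[column] == value:
                    matches.append(row)
        return matches


table = ToyTable()
for user_id in range(1000):
    country = "NP" if user_id % 100 == 0 else "AU"
    table.insert(user_id, {"user_id": user_id, "country": country})

table.by_partition_key(42)
key_lookup_cost = table.rows_read  # rows read for a partition-key lookup

matches = table.by_non_key_column("country", "NP")
scan_cost = table.rows_read  # rows read for the unindexed filter

print(key_lookup_cost, scan_cost, len(matches))  # prints "1 1000 10"
```

In this model the unindexed filter reads all 1000 rows to return 10, while the partition-key lookup reads one. On a real cluster the scan is spread over several nodes (and, with Spark, over several tasks), but the whole table still has to be read.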


Thanks, Erick!