ROBO asked starlord answered

What is the best way to check if a record exists in a table with large partitions in Cassandra?

I have a transaction table, and I want to check whether a single record exists from Spark 3.0.3.

I found two ways:

1. With a session

Example:

CassandraConnector(conf).withSessionDo { session =>

It is not clear who closes the session after the Spark job has finished.

The other way is through the catalog configuration, i.e. through a DataFrame:

spark.sql("

2. Which of the two is also good practice for updating a single record?

3. Which of the two is also good practice for updating multiple records?


spark-cassandra-connector

1 Answer

starlord answered

I don't believe the IF EXISTS / IF NOT EXISTS functionality works for DataFrames, only for RDDs, as described on this page:

https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md
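
As a rough illustration of the RDD route, here is a minimal sketch that saves with the connector's `WriteConf(ifNotExists = true)` option. The keyspace `ks`, the table `transactions`, the `Txn` case class with its `id` and `amount` columns, and the contact point are all assumptions made up for the example, not details from the question.

```scala
import com.datastax.spark.connector._
import com.datastax.spark.connector.writer.WriteConf
import org.apache.spark.{SparkConf, SparkContext}

object InsertIfNotExists {
  // Hypothetical row type; field names are assumed to match the table's columns.
  case class Txn(id: String, amount: Double)

  // Hypothetical keyspace and table names.
  val keyspace = "ks"
  val table = "transactions"

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("insert-if-not-exists")
      .set("spark.cassandra.connection.host", "127.0.0.1") // assumed contact point
    val sc = new SparkContext(conf) // master is expected to come from spark-submit

    val rows = sc.parallelize(Seq(Txn("t-1", 10.0)))

    // ifNotExists = true makes the connector issue INSERT ... IF NOT EXISTS,
    // so rows whose primary key is already present are left untouched.
    rows.saveToCassandra(keyspace, table, writeConf = WriteConf(ifNotExists = true))

    sc.stop()
  }
}
```

Conditional inserts are lightweight transactions in Cassandra, so this is likely to be noticeably slower than a plain insert and is probably best kept for small batches.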

For DataFrames this is still a feature request, so you would simply insert the new record. Cassandra writes are upserts, so the insert overwrites the row if it already exists and creates it if it doesn't. For write examples, see this page:

https://github.com/datastax/spark-cassandra-connector/blob/master/doc/14_data_frames.md#persisting-a-dataframe-to-cassandra-using-the-save-command
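
For the DataFrame route, a minimal sketch of an upsert-style write could look like the following. It uses the same call for one record or many. Again, the keyspace `ks`, the table `transactions`, the column names, and the contact point are assumptions for illustration only.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object UpsertWithDataFrame {
  // Hypothetical target keyspace and table.
  val target: Map[String, String] = Map("keyspace" -> "ks", "table" -> "transactions")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("upsert-with-dataframe")
      .config("spark.cassandra.connection.host", "127.0.0.1") // assumed contact point
      .getOrCreate()
    import spark.implicits._

    // One row or many: column names are assumed to match the table's columns.
    val df = Seq(("t-1", 10.0), ("t-2", 25.5)).toDF("id", "amount")

    // Append mode writes each row as a plain Cassandra insert, which
    // overwrites any existing row with the same primary key.
    df.write
      .format("org.apache.spark.sql.cassandra")
      .options(target)
      .mode(SaveMode.Append)
      .save()

    spark.stop()
  }
}
```

Because no read happens before the write, this avoids a separate existence check, at the cost of not being able to tell whether a row was created or overwritten.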
