I am using the Spark Cassandra Connector to move data from an on-premises Cassandra cluster to a PaaS Cassandra cluster in the cloud, similar to the scenario described in https://www.datastax.com/blog/migrate-cassandra-apps-cloud-20-lines-code
My setup is:
1. On-Premises Cassandra cluster
2. Spark cluster in cloud
3. PaaS Cassandra cluster in cloud
I need to configure a firewall to allow connectivity from the Spark cluster in the cloud (using the Spark Cassandra Connector) to the on-premises Cassandra cluster. I have read the 'initial contact' section of the documentation at https://github.com/datastax/spark-cassandra-connector/blob/master/doc/1_connecting.md, but I need some clarification. My questions (all about the on-premises Cassandra cluster) are:
1. I understand that we can provide any one node (possibly a seed node) in spark.cassandra.connection.host, but does the connector eventually connect to (or need to be able to connect to) all nodes, or multiple nodes, of the Cassandra cluster in a specific DC? Or is the ability to connect to a single node enough?
2. If my firewall device only allows connections to a single node, will the connector still work?
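To show what the firewall currently allows, a reachability check like the one below could be run from a Spark node. This is only a minimal sketch: the node addresses are placeholders, the helper names `is_reachable` and `reachable_nodes` are made up for illustration, and it assumes the on-premises nodes listen on 9042, Cassandra's default native transport port. It tests plain TCP connectivity only and says nothing about which nodes the connector actually needs.

```python
import socket

# Placeholder addresses for the on-premises nodes (assumption; replace with real ones).
ONPREM_NODES = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

# Cassandra's default native transport (CQL) port; adjust if the cluster uses another.
NATIVE_PORT = 9042


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def reachable_nodes(hosts, port=NATIVE_PORT, timeout=3.0):
    """Map each host to whether its port accepts TCP connections."""
    return {host: is_reachable(host, port, timeout) for host in hosts}
```

Calling `reachable_nodes(ONPREM_NODES)` from a Spark node returns a dictionary of host to `True`/`False`, which makes it easy to see whether the firewall exposes one node or all of them.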
Thanks for your guidance!