lchu122 asked Erick Ramirez answered

"Couldn't find system_schema or any similarly named keyspaces" in PySpark

system_schema not found using the DataStax Spark Cassandra Connector (Cassandra 3.11.2, Spark 3.0.1)

The idea is to query system_schema.tables to dynamically discover all tables in a Cassandra cluster using DataSource V1.

# query keyspaces
print("\n***Querying keyspaces in Cassandra***")
cassandra_options['keyspace'] = 'system_schema'
cassandra_options['table'] = 'keyspaces'
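# assumes `spark` is the active SparkSession and cassandra_options holds the connection settings
df_keyspaces = spark.read.format("org.apache.spark.sql.cassandra")\
    .options(**cassandra_options)\
    .load()

# keyspaces = df_keyspaces.select("keyspace_name").rdd.flatMap(lambda x: x).collect()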

# query tables
print("\n***Querying tables in Cassandra***")
cassandra_options['keyspace'] = 'system_schema'
cassandra_options['table'] = 'tables'
df_tables = spark.read.format("org.apache.spark.sql.cassandra")\
    .options(**cassandra_options)\
    .load()


However, when I try to do so, I get the following error:

pyspark.sql.utils.AnalysisException: Couldn't find system_schema or any similarly named keyspaces;

When I log in to cqlsh with the required username, I can find the system_schema keyspace and run SELECT statements against it just fine. I can also read other tables we've created through the Spark Cassandra Connector without problems.
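A read of a user table through the connector typically follows the pattern sketched below. This is only an illustration under stated assumptions: `table_options` and `read_table` are hypothetical helper names, `base_options` stands in for the `cassandra_options` dict used above, and `spark` is assumed to be an active SparkSession.

```python
def table_options(base_options, keyspace, table):
    """Return a copy of the connector options pointed at one table."""
    options = dict(base_options)  # copy so the shared dict is not mutated
    options["keyspace"] = keyspace
    options["table"] = table
    return options


def read_table(spark, base_options, keyspace, table):
    """Load one Cassandra table as a DataFrame through the connector.

    `spark` is assumed to be an active SparkSession with the
    Spark Cassandra Connector on its classpath.
    """
    return (
        spark.read.format("org.apache.spark.sql.cassandra")
        .options(**table_options(base_options, keyspace, table))
        .load()
    )
```

Copying the options per call avoids one read silently inheriting the keyspace or table set by a previous one.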

Per the docs, all users are implicitly authorized to query system_schema, since it is often read implicitly. Here are the role permissions anyway:

 role      | resource                                              | permissions
 dbadmin |                                                  data | {'ALTER', 'AUTHORIZE', 'CREATE', 'DROP', 'MODIFY', 'SELECT'}

I'm at a loss as to whether system_schema can be queried from Spark at all. We're manually maintaining a list of keyspaces and tables in the meantime, but it would be ideal if we could pull the tables from system_schema.
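One possible interim alternative to a hand-maintained list is to read the schema through the DataStax Python driver instead of through Spark. The sketch below is not part of the original setup: it assumes the `cassandra-driver` package is installed and reachable from the Spark driver node, and `SYSTEM_KEYSPACES`, `list_user_tables`, and `fetch_user_tables` are hypothetical names chosen for illustration.

```python
# Keyspaces that Cassandra 3.x creates for its own bookkeeping.
SYSTEM_KEYSPACES = frozenset({
    "system", "system_schema", "system_auth",
    "system_distributed", "system_traces",
})


def list_user_tables(metadata, skip=SYSTEM_KEYSPACES):
    """Map each non-system keyspace name to a sorted list of its table names.

    `metadata` is expected to look like cassandra.cluster.Cluster.metadata:
    a `.keyspaces` dict whose values expose a `.tables` dict keyed by name.
    """
    return {
        name: sorted(ks.tables)
        for name, ks in sorted(metadata.keyspaces.items())
        if name not in skip
    }


def fetch_user_tables(hosts, username, password):
    """Connect with the Python driver and list tables (needs cassandra-driver)."""
    # Imported lazily so the pure helper above works without the driver.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    auth = PlainTextAuthProvider(username=username, password=password)
    cluster = Cluster(hosts, auth_provider=auth)
    try:
        cluster.connect()  # the driver loads schema metadata on connect
        return list_user_tables(cluster.metadata)
    finally:
        cluster.shutdown()
```

The resulting dict of keyspace name to table names could then drive the per-table Spark reads that already work, without going through system_schema in Spark.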

I'm assuming you can query other tables? I'm just wondering if it's only the system_schema keyspace.
Erick Ramirez answered

You didn't specify which distribution of Cassandra you're connecting to, and I'd like to quickly rule that out. The connector is only tested against open-source Apache Cassandra and DSE, so it's not guaranteed to work with other forks or distributions of Cassandra. Cheers!

