I am reading Cassandra table data from a Spark job, and I am observing that when a column's value is null, that column is ignored in Spark.
Given below is an example.
Cassandra data:
{last_a_date: '2020-11-01 23:26:24.372000+0000', id: null, username: 'anurag'}
In Spark:
{last_a_date: '2020-11-01 23:26:24.372000+0000', username: 'anurag'}
Here the id column is ignored.
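For reference, this is roughly how I am reading the table (a minimal sketch, assuming the DataFrame API of the spark-cassandra-connector; the keyspace/table names and connection host are placeholders, not the ones from my actual job):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the read path; "my_keyspace" and "users" are
// placeholder names standing in for the real keyspace and table.
val spark = SparkSession.builder()
  .appName("cassandra-null-read")
  .config("spark.cassandra.connection.host", "127.0.0.1") // assumed host
  .getOrCreate()

val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_keyspace", "table" -> "users"))
  .load()

// In my output, the row shown above comes back without the id field.
```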
I am assuming this is happening because the cell value was deleted and Cassandra has marked it with a tombstone. I have the following questions about this:
1. If a table is created with 4 columns and only 3 of them are populated, with the fourth column left as null, would Spark read that column and populate the DataFrame with it, rather than ignoring the column because its value is null?
2. What can I do to ensure that when a column value is deleted, Spark still reads the column instead of ignoring it (ideally surfacing it as an explicit null, as sketched below)?
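To make the expected behavior concrete, this is the kind of output I am hoping for (a sketch only, assuming the same DataFrame read as above; the values are taken from my example row):

```scala
// Desired: a row whose id cell was deleted (or never set) still shows
// up in the DataFrame, with id as an explicit null.
df.select("last_a_date", "id", "username").show(truncate = false)
// +-------------------------------+----+--------+
// |last_a_date                    |id  |username|
// +-------------------------------+----+--------+
// |2020-11-01 23:26:24.372000+0000|null|anurag  |
// +-------------------------------+----+--------+
```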