This error occurs because of how Python handles date and time formatting, as explained in the Apache Cassandra™ ticket CASSANDRA-12360 and in the Python documentation: https://docs.python.org/2/library/datetime.html
One way to make it work with the COPY command is to add the following to the .cqlshrc file, which by default is located in the user's home directory under ~/.cassandra.
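The original snippet is not reproduced here. A commonly used form (a hypothetical example, not necessarily the article's exact setting) sets the COPY DATETIMEFORMAT option in the [copy] section of cqlshrc, rendering timestamps as raw epoch seconds instead of formatted dates, which matches the integer values seen in the exported CSV later in this article:

```ini
; Hypothetical cqlshrc fragment -- adjust the format to your needs.
; The [copy] section sets defaults for cqlsh COPY options;
; datetimeformat controls how timestamp values are rendered.
; Note: the %s directive is platform-dependent in strftime.
[copy]
datetimeformat = %s
```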
After this change, the COPY command works as follows:
cqlsh:test> copy test.date_format to '/path/to/date_format.csv';
Using 7 child processes

Starting copy of test.date_format with columns [a, b].
Processed: 2 rows; Rate: 5 rows/s; Avg. rate: 5 rows/s
2 rows exported to 1 files in 0.211 seconds.
cqlsh:test> exit

$ cat /path/to/date_format.csv
1,-912915758
2,-552716722
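Assuming the integers in the exported CSV are raw epoch seconds (an assumption; the article does not state the encoding), a short Python 3 sketch decodes them, showing that both values are pre-epoch (pre-1970) timestamps, the range where Python's datetime formatting runs into trouble:

```python
from datetime import datetime, timezone

# Assumption: the CSV integers are seconds since 1970-01-01 UTC.
# Using an explicit UTC timezone avoids platform-local conversion,
# which can fail for negative (pre-epoch) timestamps on some systems.
for raw in (-912915758, -552716722):
    decoded = datetime.fromtimestamp(raw, tz=timezone.utc)
    print(raw, "->", decoded.isoformat())
```

Both values decode to dates in the 1940s and 1950s, i.e. before the Unix epoch.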
That said, the approach above is not recommended. The preferred approach is to use the DataStax Bulk Loader (DSBulk), whose load and unload operations handle these values correctly and work around the limitations of the Python datetime type. For example:
$ dsbulk load -k test -t date_format -url test
Operation directory: /path/to/logs/LOAD_20200714-235309-741763
total | failed | rows/s | mb/s | kb/row | p50ms | p99ms | p999ms | batches
    2 |      0 |     18 | 0.00 |   0.01 |  8.92 | 13.11 |  13.11 |    1.00
Operation LOAD_20200714-235309-741763 completed successfully in 0 seconds.
Last processed positions can be found in positions.txt