irisha_mur_186571 asked smadhavan commented

Do the data types of Solr-indexed columns affect CPU utilisation on nodes?

I had a search index on a large table whose indexed fields were all primitive types (text, double, integer). To meet new requirements, we created new complex types for the data (using "CREATE TYPE ..." statements), but since then I have seen very high CPU consumption on every node (Solr joins are disabled, but the Solr node runs on the same machine as the Cassandra node).

Is it correct to conclude that Solr indexing of the complex types is to blame for the high CPU consumption? (Or how can I get evidence of this from OpsCenter?)


commented ·

what DSE version?

smadhavan ♦ commented ·

@irisha_mur_186571, in addition to the questions asked by @Erick Ramirez, please also answer the following:

  • Have the recommended settings been applied on the DSE machines?
  • A single-token architecture is always good for Search workload nodes. If you have vnodes turned on, how many tokens per node are being used? (A diagnostic tarball could also help.)

1 Answer

Erick Ramirez answered

As a general rule, user-defined types (UDTs) are more expensive to index than primitive CQL data types because every UDT is indexed as a nested document in Search, which means that each one results in 1 additional Solr document. How much impact this has on your cluster varies widely, though, since it depends on the CQL schema, the Solr schema, how much nesting is involved in your UDT, etc.
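To get a feel for the "1 additional Solr document per UDT" rule, here is a rough back-of-the-envelope sketch in Python. The function name `estimate_solr_docs` and the row counts are hypothetical and purely illustrative. It assumes flat (non-nested) UDTs and that every row populates every UDT column, so treat the result as a ballpark figure rather than a measurement of your cluster.

```python
def estimate_solr_docs(rows: int, udt_fields_per_row: int) -> int:
    """Rough Solr document count for a search-indexed table.

    Assumption (illustrative only): one parent document per CQL row,
    plus one nested document per UDT value in that row. UDTs nested
    inside other UDTs would add further documents and are not modeled.
    """
    if rows < 0 or udt_fields_per_row < 0:
        raise ValueError("counts must be non-negative")
    return rows * (1 + udt_fields_per_row)


# Hypothetical table with 1 million rows.
primitive_only = estimate_solr_docs(1_000_000, 0)  # primitive columns only
with_two_udts = estimate_solr_docs(1_000_000, 2)   # two UDT columns per row

# Ratio of documents Solr has to maintain after adding the UDT columns.
print(with_two_udts / primitive_only)  # 3.0
```

Under these assumptions, two UDT columns triple the number of documents to index for the same number of rows, which is the kind of multiplier that can show up as extra indexing CPU.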

If you provide additional information, I can ask one of our engineers to review it and provide their thoughts:

  • sample CQL schema of your UDT
  • sample CQL schema of the table
  • sample Solr schema
  • DSE version

Please post the additional info by editing your original question. Cheers!
