question

irisha_mur_186571 asked ·

Do the data types of Solr-indexed columns affect CPU utilisation on nodes?

I had a search index on a big table whose indexed fields were all primitive types (text, double, integer). To meet new requirements, we decided to create new complex types for the data (using "CREATE TYPE..." statements), but since that change I see very high CPU consumption per node (Solr joins are disabled, but the Solr node runs on the same machine as the Cassandra node).

Is it correct to conclude that Solr indexing of the complex types is to blame for the high CPU consumption? (Or how can I get evidence of this from OpsCenter?)

dse performance search
2 comments

What DSE version are you running?

@irisha_mur_186571, in addition to the questions asked by @Erick Ramirez and @alex.ott, please also answer the following:

  • Have the recommended settings been applied on the DSE machines?
  • A single-token architecture is always a good choice for Search workload nodes. If you have vnodes turned on, how many tokens are configured per node? (A diagnostic tarball could also help.)

1 Answer

Erick Ramirez answered ·

As a general rule, user-defined types (UDTs) are more expensive to index than primitive CQL data types because every UDT is a nested document in Search, meaning that it results in one additional Solr document. How much impact this has on your cluster will vary widely, as it depends on the CQL schema, the Solr schema, how much nesting is involved in your UDT, and so on.
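
To get a rough feel for the document-count effect, here is a back-of-the-envelope sketch in Python. It assumes only what is stated above: a row is one Solr document and each UDT value adds one more. It further assumes that a UDT nested inside another UDT also adds one document, which is an illustrative guess rather than confirmed behaviour. The names UdtColumn, docs_per_row and total_docs are hypothetical and are not part of any DSE or Solr API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UdtColumn:
    """A UDT-typed column; `nested` lists UDT-typed fields inside it (hypothetical model)."""
    name: str
    nested: List["UdtColumn"] = field(default_factory=list)


def docs_for_udt(column: UdtColumn) -> int:
    # Assumption: one extra Solr document per UDT value,
    # plus one for each UDT nested inside it (recursively).
    return 1 + sum(docs_for_udt(child) for child in column.nested)


def docs_per_row(udt_columns: List[UdtColumn]) -> int:
    # One document for the row itself, plus the documents added by its UDT columns.
    return 1 + sum(docs_for_udt(column) for column in udt_columns)


def total_docs(row_count: int, udt_columns: List[UdtColumn]) -> int:
    # Estimated number of Solr documents for the whole table.
    return row_count * docs_per_row(udt_columns)


if __name__ == "__main__":
    # Hypothetical table: one flat UDT column and one UDT column with a nested UDT.
    columns = [
        UdtColumn("contact"),
        UdtColumn("address", nested=[UdtColumn("geo")]),
    ]
    print(docs_per_row(columns))
    print(total_docs(1_000_000, columns))
```

This only estimates how many documents Solr has to build; it says nothing about the CPU cost per document, which still depends on the actual CQL and Solr schemas.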

If you provide additional information, I can ask one of our engineers to review it and provide their thoughts:

  • sample CQL schema of your UDT
  • sample CQL schema of the table
  • sample Solr schema
  • DSE version

Please post the additional info by editing your original question. Cheers!
