David Jones-Gilardi asked:

What is the recommended number of columns indexed per table using DSE Search?

Per the following question regarding DSE Search:

For example, I can limit the number of columns to say less than 10, but would be good to have 10-15 tables indexed. Should I worry about the increased indexes in my cluster?

Obviously, part of the sizing question depends on what is in these indexes, how much RAM is available, the number of replicas, and so on. But without knowing that, and looking only at the overall number of indexes, are there any general guidelines we can give on the number of columns indexed per table, or overall per cluster?

Tags: performance, search, capacity

1 Answer

David Jones-Gilardi answered:

After some conversations and a little research, I'm going to take a crack at answering this myself.

First, start with the capacity planning doc for Search: https://docs.datastax.com/en/dse-planning/doc/planning/capacityPlanningSearch.html. Search indexes consume additional resources and need proper planning in a production environment, and there are many variables to consider.


  • From there, as a general guideline, keep a dozen or fewer columns indexed per table.
  • By default, index only the columns you NEED rather than indexing every column in a table.
  • Start with a single table: index it, test, and make sure things look good before moving on to the next. Avoid indexing ALL of your tables before doing any testing.
  • Have metrics in place so you can tell when something goes wrong.
    • Things like:
      • iostat
      • filter cache evictions (the Solr filter cache)
      • how often you're flushing segments
      • GC logs
  • Store your Solr data on a separate SSD from your Cassandra (C*) data. This is covered in the planning doc linked above.
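
To make the first two guidelines concrete, here is a minimal Python sketch that builds a CREATE SEARCH INDEX statement from an explicit column list and refuses to go past a dozen columns. The helper name, the keyspace/table/column names, and the limit of 12 are illustrative assumptions taken from the guideline above; DSE itself does not enforce this limit, and you should check the CREATE SEARCH INDEX syntax against the docs for your DSE version.

```python
import re

# Soft limit from the "a dozen or fewer" guideline; an assumption, not a DSE-enforced cap.
MAX_INDEXED_COLUMNS = 12

# Conservative check for unquoted CQL identifiers (letters, digits, underscores).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_search_index_cql(keyspace, table, columns, max_columns=MAX_INDEXED_COLUMNS):
    """Return a CREATE SEARCH INDEX statement that indexes only `columns`.

    Hypothetical helper: it only builds the statement text, it does not run it.
    """
    columns = list(columns)
    if not columns:
        raise ValueError("list the columns you need explicitly")
    if len(columns) > max_columns:
        raise ValueError(
            "%d columns requested, guideline is %d or fewer" % (len(columns), max_columns)
        )
    for name in [keyspace, table] + columns:
        if not _IDENTIFIER.match(name):
            raise ValueError("not a simple CQL identifier: %r" % (name,))
    return "CREATE SEARCH INDEX IF NOT EXISTS ON %s.%s WITH COLUMNS %s;" % (
        keyspace,
        table,
        ", ".join(columns),
    )


if __name__ == "__main__":
    # Hypothetical keyspace, table, and columns, for illustration only.
    print(build_search_index_cql("store", "orders", ["customer_id", "status"]))
```

Run the resulting statement against one table first (for example from cqlsh), then watch the metrics listed above before indexing the next table.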


This is by no means exhaustive, but it is a start.
