Performance Issue during Indexing - more optimal search totals calculation
Not LIVE FROM THE MINNESOTA SEARCH SPRINT, but found on the plane ride home...
After cron completes indexing a chunk of nodes, search_update_totals() updates {search_total} with the new totals for each modified word. It performs some of the calculations in php and thus makes 3 SQL statements for each word. Since you can update 100's of nodes with 1000's of words, that's a lot of SQL to perform at the end of each cron run.
The attached patch greatly simplifies this by updating 500 words at a time. I arbitrarily limited it 500 words, not knowing what was a good limit to the SQL packet size.
The one concern I have about this approach is databases other than mysql. I'm relying on the non-standard function LOG10(). I believe that we use GREATEST() somewhere else, so I don't think that this is an issue. So, we need to check if pgsql supports LOG10().
And we should discuss how many words at a time are reasonable to update.