Title Text search selectivity functions for PostgreSQL.
Student Jan Urbański
Mentor Heikki Linnakangas
Abstract
I will implement better selectivity estimates for text search queries
in PostgreSQL.
When a user enters a query, the planner chooses from many possible
ways of executing it and tries to pick the optimal one. In order to do
that reliably, the planner makes assumptions about how many tuples
will be processed in each execution step. These assumptions are in turn based on statistical data gathered from database relations.
As of now, PostgreSQL assumes that every text search query will return a fixed fraction of the relation's total row count. This is unrealistic, because the true fraction depends on how common the queried lexemes are in the stored documents, and it can lead to suboptimal query plans.
I will implement a mechanism for gathering statistical data from text documents and using it to get better selectivity estimates for text search queries. It will be based on existing PostgreSQL infrastructure. The most frequent lexemes will be computed and stored by a custom typanalyze function, and text search selectivity will be estimated based on that data.
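The idea can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not the PostgreSQL code itself: the function names, the nested-tuple query representation, and the fallback value for lexemes missing from the statistics are all hypothetical, and the sketch assumes that lexemes occur independently of each other when combining estimates.

```python
from collections import Counter

# Hypothetical fallback for lexemes absent from the statistics (an assumption).
DEFAULT_SELECTIVITY = 0.001


def most_common_lexemes(documents, k):
    """Return {lexeme: fraction of documents containing it} for the k most
    frequent lexemes. Each document is an iterable of lexemes."""
    counts = Counter()
    for doc in documents:
        counts.update(set(doc))  # count each lexeme at most once per document
    total = len(documents)
    return {lexeme: n / total for lexeme, n in counts.most_common(k)}


def estimate_selectivity(query, stats, default=DEFAULT_SELECTIVITY):
    """Estimate the fraction of documents matching a query.

    A query is a nested tuple (hypothetical representation):
    ('lex', word), ('not', q), ('and', q1, q2) or ('or', q1, q2).
    """
    op = query[0]
    if op == 'lex':
        return stats.get(query[1], default)
    if op == 'not':
        return 1.0 - estimate_selectivity(query[1], stats, default)
    s1 = estimate_selectivity(query[1], stats, default)
    s2 = estimate_selectivity(query[2], stats, default)
    if op == 'and':
        return s1 * s2            # independence assumption
    if op == 'or':
        return s1 + s2 - s1 * s2  # inclusion-exclusion under independence
    raise ValueError("unknown operator: %r" % (op,))
```

In this model, the first function plays the role of the statistics-gathering step, and the second one plays the role of the selectivity estimator that the planner would consult.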
The final result will be a contrib module that could possibly be merged into a future core release.