PostgreSQL FTS is mostly great - I wrote a tutorial on using it to build faceted search with Django a few years ago: <a href="https://simonwillison.net/2017/Oct/5/django-postgresql-faceted-search/" rel="nofollow">https://simonwillison.net/2017/Oct/5/django-postgresql-facet...</a><p>It does have one surprising limitation: it calculates relevance based on just the current row, rather than being able to take statistics across the whole corpus into account.<p>Most search engines use TF/IDF or BM25 for relevance calculations, which consider the relative common-ness of terms in comparison to the rest of the corpus. PostgreSQL FTS can't do that as far as I know.<p>SQLite's built-in FTS CAN do relevance calculations like this! Surprising to see a feature as significant as that show up in SQLite but not in PostgreSQL.
I often see complicated search solutions implemented when PostgreSQL is just sitting there with its incredible FTS powers being ignored. To help some of the teams I'm working with get an idea of the power available to them I wrote up this two part article with a github repo for spinning up a db to follow along.
I use it; it is indeed awesome, although the “powerful” version of the query DSL can be intimidating (I implemented some complex-ish regex transformations to utilize it).<p>It’s fantastic having 1 less dependency though!<p>One caveat- if you migrate any field in the indexed table, you will likely have to drop and recreate all your triggers and stored procs again
I read these articles and am always tempted, but I've found they often don't do well with Chinese, Japanese, Korean or other non-latin languages. I understand that isn't the focus for most people but it's very hard to find good data/information on how to best support these languages.
I've had problems with performance over large data sets, maybe 1M records and a few < 300 char fields per record.<p>I've never implemented full text search using a dedicated database so maybe 1M is too much to ask for.
Great article and underlying point. Reminds me of <a href="https://www.amazingcto.com/postgres-for-everything/" rel="nofollow">https://www.amazingcto.com/postgres-for-everything/</a> which also highlights FTS by linking to this article: <a href="https://supabase.com/blog/postgres-full-text-search-vs-the-rest">https://supabase.com/blog/postgres-full-text-search-vs-the-r...</a>
The only problem I face is partial querying. For instance, if the user queries rus (and it does not match any lexemme) nothing returns. Any workarounds?