As a (junior) patent examiner, I had the weaknesses of text search covered in my training, and they have become very clear over time. Many people today think that text search is the "be-all and end-all" of search, but if one wants to be comprehensive, text search should be only one part of a search strategy. Other major components include citation search (forwards and backwards) and classification search.<p>Google can identify many synonyms today, but in my experience it frequently misses important ones. I've started compiling lists of synonyms and even partial search queries (medical searchers call these "hedges") to use when searching. The problem of synonyms is one place where citation and classification search shine, as they are independent of the terminology used (and even <i>language</i> independent in the case of a classification like the IPC). There's no one "best" approach; these approaches complement one another. You can also do a "combination" search, e.g., of all the documents citing a given document, return those that contain a keyword.<p>Unfortunately, classification search has fallen out of favor among the general population, but I can see systems like the Dewey Decimal System being extremely useful when the terminology in a field varies appreciably. Classification search is extremely useful in my work.<p>When I have the time I'll take a close look at this article. Thanks for posting it.
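A minimal sketch of the "combination" search described above, assuming a tiny in-memory corpus: take every document that cites a given document, then keep only those whose text matches a synonym list (a "hedge"). The `Document` class, the document IDs, the sample texts, and the `FASTENER_HEDGE` list are all hypothetical and exist only for illustration; a real patent database would expose citations and full text through its own interface. Requires Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical record: an ID, some text, and the IDs it cites."""
    doc_id: str
    text: str
    cites: set[str] = field(default_factory=set)  # backward citations


def citing_documents(corpus: dict[str, Document], target_id: str) -> list[Document]:
    """Forward citation search: every document that cites target_id."""
    return [doc for doc in corpus.values() if target_id in doc.cites]


def matches_hedge(text: str, hedge: tuple[str, ...]) -> bool:
    """True if the text contains any term from the synonym list.

    Plain case-insensitive substring matching; no stemming or tokenizing.
    """
    lowered = text.lower()
    return any(term.lower() in lowered for term in hedge)


def combination_search(
    corpus: dict[str, Document], target_id: str, hedge: tuple[str, ...]
) -> list[str]:
    """IDs of documents that cite target_id AND match the hedge, sorted."""
    return sorted(
        doc.doc_id
        for doc in citing_documents(corpus, target_id)
        if matches_hedge(doc.text, hedge)
    )


# Hypothetical hand-compiled synonym list for one concept.
FASTENER_HEDGE = ("screw", "bolt", "threaded fastener")

# Hypothetical corpus: B, C, and D cite A; E does not.
corpus = {
    "A": Document("A", "A threaded fastener for joining panels."),
    "B": Document("B", "An improved screw with a self-tapping tip.", {"A"}),
    "C": Document("C", "A hinge assembly for cabinet doors.", {"A"}),
    "D": Document("D", "Bolt retention system.", {"A"}),
    "E": Document("E", "A screw conveyor for grain.", set()),
}

print(combination_search(corpus, "A", FASTENER_HEDGE))
```

Here C is dropped because it cites A but uses none of the terms, and E is dropped because it uses a term but does not cite A. The citation step is independent of terminology, while the hedge step narrows the result, which is why the two complement each other.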
In theory there is amazing progress in retrieval, but in practice we have Google. Maybe their motives are not aligned with search improvement after all.<p>The problem of meaning disambiguation has been solved with neural nets to a much higher degree than is apparent in Google's search engine.