This is a an interesting way of doing text search over a collection of documents, that I suppose makes sense if you only have a small number of documents to search.<p>Generally, an inverted index [1] is used rather than a trie, and collection-based term weights are calculated using something like IDF [2]. You could then use something like cosine to compute length normalised rankings for documents.<p>There are of course many different approaches to text search, but this is the first time I've personally come across this one.<p>If anyone is interested in an in depth look at how text search works, an Introduction to Information Retrieval [3] is available for free and is a fantastic book.<p>[1] <a href="https://en.wikipedia.org/wiki/Inverted_index" rel="nofollow">https://en.wikipedia.org/wiki/Inverted_index</a><p>[2] <a href="https://en.wikipedia.org/wiki/Tf%E2%80%93idf" rel="nofollow">https://en.wikipedia.org/wiki/Tf%E2%80%93idf</a><p>[3] <a href="https://nlp.stanford.edu/IR-book/" rel="nofollow">https://nlp.stanford.edu/IR-book/</a>
Finally a post on search algorithms on HN again!<p>Pardon my surprise, but I think it's kind of strange that we see so little on HN of this topic, which is one of the pillars of Computer Science, and also at the core of one of the most successful IT companies.
If you're a working programmer looking for a bit more depth on text searching, check out Taming Text by Ingersoll. It was written by some real pros, and it's very practical. It goes over everything from the first naive approaches to what drives many engines like Lucene today.<p><a href="https://www.amazon.com/Taming-Text-Find-Organize-Manipulate/dp/193398838X" rel="nofollow">https://www.amazon.com/Taming-Text-Find-Organize-Manipulate/...</a>
I'm a little surprised to not be seeing some classical search algorithms.<p>For online pattern searching, I'd want Boyer-Moore with the the Galil rule for guaranteed linear search. [Due to an unfortunate accident in C++ history, Boost and C++17 skip the Galil rule, leading to O(n+m) runtime vs. O(nm).]<p>For offline, Burrows-Wheeler has performance linear with query string and <i>independent</i> of the length of the searched document.
Certainly its worth taking a look at another javascript library which works for fuzzy search called, fuse, <a href="http://fusejs.io/" rel="nofollow">http://fusejs.io/</a>.
This is neat, and reminds me a bit of the tricks used by RediSearch. I came across it on HN recently and have been geeking out on it since! <a href="http://redisearch.io" rel="nofollow">http://redisearch.io</a>
Lunr, named to correspond to Apache Solr, is also worth a look. <a href="https://lunrjs.com/" rel="nofollow">https://lunrjs.com/</a>