I'm surprised to see that ML-based semantic search is barely touched on in this article. There's a strong focus on entity matching, but an arguably more powerful way to conduct similarity search is to leverage embedding vectors from trained models.<p>A great upside to this approach is that it works for many types of unstructured data (images, video, molecular structures, geospatial data, etc.), not just text. The rise of multimodal models such as CLIP (<a href="https://openai.com/blog/clip" rel="nofollow">https://openai.com/blog/clip</a>) makes this even more relevant today. Combine these embeddings with a vector database such as Milvus (<a href="https://milvus.io" rel="nofollow">https://milvus.io</a>) and you'll be able to do this at scale with minimal effort.
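<p>For anyone who hasn't seen the embedding approach before, here's a rough sketch of the core idea in plain Python. The 3-dimensional vectors and file names are made up for illustration; real embeddings would come from a trained model and have hundreds of dimensions, and the helper names (cosine_similarity, top_k) are my own, not from any library. It's a brute-force scan, which is exactly the part a vector database would replace with an approximate index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Brute-force nearest neighbours: score every item, keep the k best."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(key, score) for score, key in scored[:k]]


# Hypothetical, hand-written "embeddings" (assumption: a real system would
# get these from a model, not type them in by hand).
index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "truck.jpg": [0.0, 0.1, 0.9],
}

query = [1.0, 0.0, 0.0]  # stand-in for the embedding of a query
results = top_k(query, index, k=2)

for key, score in results:
    print(f"{key}: {score:.3f}")
```

<p>The two cat-like vectors rank above the truck because they point in nearly the same direction as the query. Nothing here depends on the items being text, which is why the same search works for images or other data once a model can embed them.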