The Arctic Embed model series from Snowflake has been one of the most impactful families of open-source text embedding models! In addition to releasing open models, which have helped a lot of companies kick off their own inference and fine-tuning services (including us at Weaviate), the Snowflake team has also published incredible research breaking down every component of how these models are trained!<p>I am SUPER EXCITED to share the 110th Weaviate Podcast, an interview with Arctic Embed co-authors Luke Merrick and Puxuan Yu, joined by Charles Pierse from Weaviate, discussing all things Arctic Embed!<p>The podcast covers the origin of Arctic Embed, pre-training embedding models, Matryoshka Representation Learning, fine-tuning embedding models, synthetic query generation, hard negative mining, and lastly a topic I personally find very interesting: perspectives on single-vector embedding models compared to ColBERT, SPLADE, or re-rankers.<p>I hope you enjoy the podcast!<p>YouTube: https://www.youtube.com/watch?v=Kjqv4uk3RCs<p>Spotify: https://creators.spotify.com/pod/show/weaviate/episodes/Arctic-Embed-with-Luke-Merrick--Puxuan-Yu--and-Charles-Pierse---Weaviate-Podcast-110-e2sg168/a-abmi4qd