
Show HN: Model2vec-Rs – Fast Static Text Embeddings in Rust

59 points by Tananon 4 days ago
Hey HN! We’ve just open-sourced model2vec-rs, a Rust crate for loading and running Model2Vec static embedding models with zero Python dependency. This lets you embed text at (very) high throughput, for example in a Rust-based microservice or CLI tool. It can be used for semantic search, retrieval, RAG, or any other text embedding use case.

Main features:

- Rust-native inference: load any Model2Vec model from Hugging Face or a local path with StaticModel::from_pretrained(...).

- Tiny footprint: the crate itself is only ~1.7 MB, with embedding models between 7 and 30 MB.

Performance:

We benchmarked single-threaded on a CPU:

- Python: ~4650 embeddings/sec

- Rust: ~8000 embeddings/sec (~1.7× speedup)

This is our first open-source project in Rust, so it would be great to get some feedback!
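For reference, here is a minimal sketch of what loading and embedding could look like with the crate. The post only confirms StaticModel::from_pretrained(...); the extra arguments, the encode method, and the model id "minishlab/potion-base-8M" are assumptions, not confirmed API:

```rust
use model2vec_rs::model::StaticModel;

fn main() {
    // Load a Model2Vec model from the Hugging Face Hub (or a local path).
    // The trailing None arguments (token / options) are assumed here.
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)
        .expect("failed to load model");

    // Embed a small batch of sentences.
    let sentences = vec![
        "Rust-native static embeddings".to_string(),
        "Semantic search without Python".to_string(),
    ];
    let embeddings = model.encode(&sentences);
    println!("Got {} embeddings of dim {}", embeddings.len(), embeddings[0].len());
}
```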

5 comments

gthompson512 3 days ago
How does it handle documents longer than the context length of the model? Sorry, but a ton of these get posted regularly and they don't usually think about this.

Edit: it seems like it just splits into sentences, which is a weird thing to do given that in English only ~95% agreement is even possible on what a sentence is.

```rust
// Process in batches
for batch in sentences.chunks(batch_size) {
    // Truncate each sentence to max_length * median_token_length chars
    let truncated: Vec<&str> = batch
        .iter()
        .map(|text| {
            if let Some(max_tok) = max_length {
                Self::truncate_str(text, max_tok, self.median_token_length)
            } else {
                text.as_str()
            }
        })
        .collect();
    // ... (rest of the loop body elided in the original excerpt)
}
```
noahbp 4 days ago
What is your preferred static text embedding model?

For someone looking to build a large embedding search, fast static embeddings seem like a good deal, but almost too good to be true. What quality tradeoff are you seeing with these models versus embedding models with attention mechanisms?
echelon 3 days ago
I love that you're doing this, Tananon.

We've been using Candle and Cudarc and having a fairly good time of it. We've built a real-time drawing app on a custom LCM stack, and Rust makes it feel rock solid. Python is way too flimsy for something like this.

The more the Rust ML ecosystem grows, the better. It's a little bit fledgling right now, so every little bit counts.

If llama.cpp had instead been llama.rs, I feel like we would have had a runaway success.

We'll be checking this out! Kudos, and keep it up!
Havoc 4 days ago
Surprised it is so much faster. I would have thought the Python one was C under the hood.
badmonster 3 days ago
How do I load a custom model instead of the ones on Hugging Face?
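Per the post above, StaticModel::from_pretrained also accepts a local path, so a custom model should load the same way as a Hub model. A minimal sketch; the directory path and argument list here are assumptions:

```rust
use model2vec_rs::model::StaticModel;

fn main() {
    // Point from_pretrained at a local directory instead of a Hub repo id
    // (directory layout and trailing arguments assumed, not confirmed API).
    let model = StaticModel::from_pretrained("./my-custom-model2vec", None, None, None)
        .expect("failed to load local model");

    let embeddings = model.encode(&["testing a custom model".to_string()]);
    println!("embedding dim: {}", embeddings[0].len());
}
```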