I clicked expecting a single full multimodal LLM made by merging multiple existing models into one, as the title suggests (which sounds very interesting), and I found... a library that routes between a bunch of LLM web APIs and exposes them under a unified, easy-to-use interface?

With all due respect, sorry, but this title is very misleading. I'd expect "build an LLM" to mean, well, actually building an LLM, and while it's a very nice library it's definitely *not* what the title suggests.
I'll jump in before the haterade engine wakes up -- great bit of engineering work here! I can't imagine a better balance between abstracting away the unnecessary stuff and retaining manual control.

The only thing I don't see is setup for local/in-house LLMs, but it's easy enough to spoof OpenAI calls if necessary -- see the sketch below.
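A minimal sketch of that spoofing approach, assuming a local server that exposes an OpenAI-compatible endpoint (Ollama's default at localhost:11434 is used here as an example; the model name is hypothetical, use whatever you've pulled locally):

    # Point the standard OpenAI client at a local server instead of api.openai.com.
    # Assumes Ollama is running with its OpenAI-compatible API at the default
    # base URL below; any other compatible endpoint works the same way.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # local endpoint, not OpenAI
        api_key="ollama",  # required by the client, ignored by the local server
    )

    response = client.chat.completions.create(
        model="llama3",  # hypothetical: substitute a model you have locally
        messages=[{"role": "user", "content": "Hello from a local LLM!"}],
    )
    print(response.choices[0].message.content)

Any library that only speaks the OpenAI wire format should accept the same base-URL override.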
Whoa, great to see Yoeven's work here. I learned about JigsawStack when I applied for a role there and was super impressed with what he's built. We ended up having a call and he was able to tell me a bit more about what he's working on.

He is a friendly and super down-to-earth guy who has made some remarkably good progress on building a platform that just works. For instance, it makes it easy to connect a fine-tuned LLM that knows how to scrape content to a translation LLM, all wrapped up in a platform with a really good developer experience.

If you're interested in this kind of thing, he also did a Show HN last year on Dzero, a distributed SQLite database built on Cloudflare D1:
https://news.ycombinator.com/item?id=40563729
You clearly don't understand what multimodal means. Multimodal is, for example, the new Gemini, where you can input a green car and get back the very same car, only with red paint. A multimodal LLM can do the edit in latent space, which is the key.

Very misleading title, and you won't get away with it by using the word "multimodal" either.