How is this different from ollama (https://github.com/jmorganca/ollama)? I would argue it's even simpler to run LLMs locally with ollama.
Mistral has published a Docker image that hosts their model in vLLM, which exposes an OpenAI-compatible HTTP API.

https://docs.mistral.ai/quickstart/
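For illustration, a minimal sketch of calling that API from Rust, assuming vLLM is serving locally on its default port 8000 and that the reqwest (with the "blocking" and "json" features) and serde_json crates are available; the model id is a placeholder and should match whatever the deployment actually loads:

```rust
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();
    // vLLM's OpenAI-compatible server listens on port 8000 by default.
    let resp: serde_json::Value = client
        .post("http://localhost:8000/v1/chat/completions")
        .json(&json!({
            "model": "mistralai/Mistral-7B-Instruct-v0.1", // placeholder model id
            "messages": [{ "role": "user", "content": "Say hello." }]
        }))
        .send()?
        .json()?;
    // Print the assistant's reply from the first choice.
    println!("{}", resp["choices"][0]["message"]["content"]);
    Ok(())
}
```

Any existing OpenAI client library should also work against it by pointing the base URL at the local server.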
Does anyone have any feedback on using these open-source models with languages other than English, particularly non-Western languages like Korean, Japanese, or Chinese?

I'll assess it myself, but I wonder if anyone has tried.
What are Mistral's strengths and weaknesses? I tried it for infrastructure as code, and it wasn't able to output more than the most basic examples, let alone modify them.
I really like the post they mention (https://www.secondstate.io/articles/fast-llm-inference/). The reasons for avoiding Python all resonate with me. I'm excited to play with WASI-NN (https://github.com/WebAssembly/wasi-nn), and the Rust code for loading a GGUF model is very readable.
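To give a flavor of it, here's a rough sketch based on the wasi-nn Rust crate as used in WasmEdge's GGML examples; the "default" model alias, the prompt, and the output buffer size are all placeholders I've assumed:

```rust
use wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Look up the GGUF model registered under the alias "default"
    // (assumed to be preloaded by the host runtime; see below).
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")
        .unwrap();
    let mut ctx = graph.init_execution_context().unwrap();

    // The GGML backend takes the raw prompt bytes as input tensor 0.
    let prompt = "Explain WASI-NN in one sentence.";
    ctx.set_input(0, TensorType::U8, &[1], prompt.as_bytes()).unwrap();
    ctx.compute().unwrap();

    // Read the generated text back out of output tensor 0.
    let mut out = vec![0u8; 4096]; // arbitrary buffer size
    let n = ctx.get_output(0, &mut out).unwrap();
    println!("{}", String::from_utf8_lossy(&out[..n]));
}
```

Compiled to the wasm32-wasi target, something like this would run under WasmEdge with the model supplied via its `--nn-preload` flag (e.g. `--nn-preload default:GGML:AUTO:model.gguf`), which is what makes the `build_from_cache("default")` lookup resolve.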
The WasmEdge README gives me the heebie-jeebies. Starry-eyed emojis, use cases highlighting today's trendiest thing even though it's a general-purpose runtime, mentions of blockchain. This reeks of former cryptobros chasing the next big thing. I'd trust Wasmtime more.