Some context for those who aren't in the loop: ONNX is an open, standardized format for AI models, and ONNX Runtime (<a href="https://onnxruntime.ai/" rel="nofollow">https://onnxruntime.ai/</a>) is the cross-platform engine for running them. Nowadays it's extremely easy to export models to ONNX, especially language models, since tools like Hugging Face transformers have dedicated export workflows for it.<p>ONNX Runtime support in the browser was lacking and limited to the CPU, but with a WebGPU backend it may now finally be feasible to run models in the browser on a GPU, which opens up interesting opportunities. That said, from this PR it looks like only a few operators are implemented so far, so no browser-based GPT yet.
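To make that concrete, here's a minimal sketch of what running a model through onnxruntime-web looks like once you request the WebGPU execution provider. The model path, the 'input'/'output' names, and the shape below are placeholder assumptions, and whether WebGPU actually kicks in depends on which operators the backend has implemented:<p>
    import * as ort from 'onnxruntime-web';

    // Ask for the WebGPU execution provider, falling back to the
    // WASM (CPU) backend where the browser or model isn't supported.
    const session = await ort.InferenceSession.create('model.onnx', {
      executionProviders: ['webgpu', 'wasm'],
    });

    // 'input' and the [1, 3, 224, 224] shape are placeholders -- use
    // whatever names and dims your exported model actually declares.
    const data = new Float32Array(1 * 3 * 224 * 224);
    const feeds = { input: new ort.Tensor('float32', data, [1, 3, 224, 224]) };

    const results = await session.run(feeds);
    console.log(results.output.data);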
A great option, but there's also wonnx, which seems more complete and mature.
And as a bonus, it's implemented in Rust (if you're into that).<p><a href="https://github.com/webonnx/wonnx">https://github.com/webonnx/wonnx</a>
A pretty cool library that uses ONNX is transformers.js [1], and they're already working on adding WebGPU support [2].<p>[1] <a href="https://xenova.github.io/transformers.js/" rel="nofollow">https://xenova.github.io/transformers.js/</a><p>[2] <a href="https://twitter.com/xenovacom/status/1650634015060156420" rel="nofollow">https://twitter.com/xenovacom/status/1650634015060156420</a>
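For anyone who hasn't tried it, transformers.js boils the whole thing down to a pipeline call (currently on the CPU/WASM backend until that WebGPU work lands; the task and sample output here are just illustrative):<p>
    import { pipeline } from '@xenova/transformers';

    // Downloads an ONNX-converted model from the Hugging Face hub on first use.
    const classifier = await pipeline('sentiment-analysis');

    const result = await classifier('ONNX in the browser is finally practical.');
    console.log(result); // e.g. [{ label: 'POSITIVE', score: 0.99 }]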
As an aside, I love ONNX, and it's the main reason I'm sticking with PyTorch. I was able to develop and train an RL model in Python, then convert it to ONNX and call it from C# production code.<p>It still took a lot of effort, but the final version is very performant and reliable.
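The same pattern works from pretty much any runtime with ONNX Runtime bindings. The parent used C#, but as an illustration of the same portability, here's roughly what the consuming side looks like in Node with onnxruntime-node; the model file and the 'obs'/'action' names and [1, 8] shape are hypothetical and should match whatever you passed to torch.onnx.export on the Python side:<p>
    import * as ort from 'onnxruntime-node';

    // Load the model exported from PyTorch (via torch.onnx.export).
    const session = await ort.InferenceSession.create('policy.onnx');

    // 'obs', 'action', and the [1, 8] shape are made-up example names --
    // they must match the input/output names declared at export time.
    const obs = new ort.Tensor('float32', new Float32Array(8), [1, 8]);
    const { action } = await session.run({ obs });
    console.log(action.data);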