Foundry Local is an on-device AI inference solution offering performance, privacy, customization, and cost advantages.<p>Performance is optimized through ONNX Runtime and hardware acceleration: Foundry Local automatically selects and downloads the model variant best suited to your hardware (a CUDA-optimized model if you have an NVIDIA GPU, an NPU-optimized model for Qualcomm NPUs, and otherwise a CPU-optimized model).<p>Python and JavaScript SDKs are available.<p>If a model is not already available in ONNX, Olive (<a href="https://microsoft.github.io/Olive/" rel="nofollow">https://microsoft.github.io/Olive/</a>) can compile existing models in Safetensors or PyTorch format into the ONNX format.<p>Foundry Local is licensed under the Microsoft Software License Terms.
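<p>Beyond the SDKs, the local service exposes an OpenAI-compatible REST API, so a request can be built with nothing but the standard library. The sketch below is a minimal, hedged example: the port <code>5273</code> and the model id <code>phi-3.5-mini</code> are assumptions for illustration, not guaranteed defaults; check your running service for the actual endpoint and available model aliases.<p>

```python
import json

# Assumed endpoint of the local OpenAI-compatible service; the port is an
# assumption for illustration and may differ on your machine.
ENDPOINT = "http://localhost:5273/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> bytes:
    """Build the JSON body for an OpenAI-style chat completion request."""
    payload = {
        "model": model,  # model alias/id; "phi-3.5-mini" below is hypothetical
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(payload).encode("utf-8")


body = build_chat_request("phi-3.5-mini", "Why is the sky blue?")

# To actually send the request while the Foundry Local service is running:
#   import urllib.request
#   req = urllib.request.Request(
#       ENDPOINT, data=body, headers={"Content-Type": "application/json"})
#   with urllib.request.urlopen(req) as resp:
#       reply = json.loads(resp.read())
#       print(reply["choices"][0]["message"]["content"])
print(json.loads(body)["model"])
```

<p>Because the wire format is OpenAI-compatible, existing OpenAI client libraries can also be pointed at the local endpoint instead of hand-building requests.<p>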