Hi HN!<p>I wanted to share my machine learning compiler, which hosts ML models on AWS Lambda.<p>If you go to the linked page, there are instructions for running a script that generates a model with sklearn, feeds it to my endpoint, and then calls the resulting endpoint on Lambda (a rough sketch of that flow follows the list below). In case you can't run the script, I've also included a video on the page.<p>Unlike other ML hosting services I've seen, where models are spun up behind containers, this service compiles the model down to either a static C library or a WASM library, and then has a template Lambda function handle the HTTP parsing, argument handling, etc.<p>Upsides of this approach:<p><pre><code> - You can directly embed models in applications, callable via FFI
- You can run ML in weird places, like mobile apps, embedded devices, or the browser
</code></pre>
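To make that flow concrete, here's a minimal sketch of what the script boils down to. To be clear, the URLs and payload fields here are hypothetical stand-ins, not the real API:<p><pre><code>  # Sketch only: train a model, send it for compilation, then call
  # the generated Lambda endpoint. URLs and field names are made up.
  import pickle
  import requests
  from sklearn.datasets import load_iris
  from sklearn.tree import DecisionTreeClassifier

  X, y = load_iris(return_X_y=True)
  model = DecisionTreeClassifier(max_depth=3).fit(X, y)

  # Upload the pickled model to a (hypothetical) compile endpoint.
  resp = requests.post("https://example.com/compile",
                       files={"model": pickle.dumps(model)})
  endpoint_url = resp.json()["endpoint"]  # URL of the generated Lambda

  # Call the generated endpoint with a single feature vector.
  pred = requests.post(endpoint_url, json={"features": [5.1, 3.5, 1.4, 0.2]})
  print(pred.json())
</code></pre>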
Downsides:<p><pre><code> - I have to implement the model inference for each algorithm
 - As such, algorithm support is limited right now
</code></pre>
My goal is to make the interface easy enough that anyone who can build a model can deploy it somewhere it delivers value, without having to ask an engineer for help.<p>I'm happy hosting models for now, but I'm also excited about building out custom integrations: running models in mobile apps, inside existing applications via FFI, or even on embedded devices (a rough sketch of what the FFI path could look like is at the end of this post).<p>Please let me know what you think!
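As a postscript, here's a flavor of the FFI path. Assuming the compiler's output is built as a shared library that exports a predict function (the library name and signature below are invented for illustration), calling it from Python might look like:<p><pre><code>  # Sketch only: call a compiled model over FFI via ctypes.
  # "libmodel.so" and the predict() signature are hypothetical.
  import ctypes

  lib = ctypes.CDLL("./libmodel.so")
  lib.predict.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
  lib.predict.restype = ctypes.c_double

  # Pack one feature vector into a C double array and run inference.
  features = (ctypes.c_double * 4)(5.1, 3.5, 1.4, 0.2)
  print(lib.predict(features, 4))
</code></pre>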