TechEcho

I've been working on Scaffoldly since 2020 to simplify AWS Lambda deployments. I recently discovered that you can run Hugging Face models efficiently on Lambda by using EFS for caching. Here's what's interesting:<p><pre><code>
 - Uses EFS for model file persistence
 - Pre-downloads models after deployment for faster cold starts
 - Cold start: ~20s (model loading), warm requests: 5-20s (CPU inference)
 - Fully automated container builds and deployment
 - Works with private/gated models via HF_TOKEN
</code></pre>
Example deployment:<p><pre><code>
 npx scaffoldly create app --template python-huggingface
 cd python-huggingface && npx scaffoldly deploy
</code></pre>
Scaffoldly is Open Source, and I'd welcome any feedback and contributions from the community!<p><a href="https://github.com/scaffoldly/scaffoldly">https://github.com/scaffoldly/scaffoldly</a><p><a href="https://github.com/scaffoldly/scaffoldly-examples/tree/python-huggingface">https://github.com/scaffoldly/scaffoldly-examples/tree/pytho...</a>
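For anyone curious how the EFS caching idea might look inside a function, here is a rough sketch. It is not taken from Scaffoldly's code; it only illustrates the general pattern of pointing the Hugging Face cache at a persistent mount and keeping the loaded model in memory between warm invocations. The mount path `/mnt/efs`, the `EFS_MOUNT` variable, and the helper names `configure_cache`, `download_model`, and `get_model` are all hypothetical. `HF_HOME`, `HF_TOKEN`, and `huggingface_hub.snapshot_download` are the real names used by the Hugging Face libraries.

```python
import os
from pathlib import Path
from typing import Any, Callable, Optional

# Assumption: the EFS access point is mounted at this (hypothetical) path.
EFS_MOUNT = Path(os.environ.get("EFS_MOUNT", "/mnt/efs"))


def configure_cache(mount: Path = EFS_MOUNT) -> Path:
    """Point the Hugging Face cache at persistent storage and return its path."""
    cache_dir = mount / "huggingface"
    cache_dir.mkdir(parents=True, exist_ok=True)
    # HF_HOME is read when the Hugging Face libraries are imported,
    # so it has to be set before those imports happen.
    os.environ["HF_HOME"] = str(cache_dir)
    return cache_dir


def download_model(model_id: str, cache_dir: Path) -> str:
    """Pre-download step: fetch model files ahead of the first request."""
    # Imported lazily so that HF_HOME is already set at import time.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=model_id,
        cache_dir=str(cache_dir / "hub"),
        token=os.environ.get("HF_TOKEN"),  # only needed for private/gated models
    )


# Module-level state survives between warm invocations of the same container.
_MODEL: Optional[Any] = None


def get_model(loader: Callable[[], Any]) -> Any:
    """Run the (slow) loader once per container, then reuse the result."""
    global _MODEL
    if _MODEL is None:
        _MODEL = loader()
    return _MODEL
```

In this sketch, `download_model` would run once after deployment so the files are already on EFS, and a request handler would call `get_model` with a loader that builds the model from the cached files. Only the first request in a fresh container pays the model-loading cost.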

Show HN: Deploy Hugging Face Models to AWS Lambda

no comments
