Hi! This is a blog post sharing some low-level Linux networking we're doing at Modal with WireGuard.<p>As a serverless platform we hit a bit of a tricky tradeoff: we run multi-tenant user workloads on machines around the world, and each serverless function is an autoscaling container pool. How do you let users give their functions static IPs while keeping those IPs decoupled from the underlying compute, which needs to stay flexible?<p>We needed a high-availability VPN proxy for containers and didn't find one, so we built our own on top of WireGuard and open-sourced it at <a href="https://github.com/modal-labs/vprox">https://github.com/modal-labs/vprox</a><p>Let us know if you have thoughts! I'm relatively new to low-level container networking, and we (me + my coworkers Luis and Jeffrey + others) have enjoyed working on this.
this is a really neat writeup! the design choice to make each "exit node" control the local wireguard connections instead of a global control plane is pretty clever.<p>an unfinished project I worked on (<a href="https://github.com/redpwn/rvpn">https://github.com/redpwn/rvpn</a>) was a bit more ambitious with a global control plane, and I quickly learned that supporting multiple clients, especially for anything networking-related, is a tarpit. the focus on linux / aws specifically here and the results achievable from it are nice to see.<p>networking is challenging and this was a nice deep dive into some networking internals, thanks for sharing the details :)
Thanks for sharing. This new feature is neat! It might sound a bit out there, but here's a thought: could you enable assigning unique IP addresses to different serverless instances? For certain use cases, like web scraping, it's helpful to simulate requests coming from multiple locations instead of just one. I think allowing requests to originate from a pool of IP addresses would be doable given this proxy model.
> Modal has an isolated container runtime that lets us share each host’s CPU and memory between workloads.<p>Looks like Modal hosts workloads in containers, not VMs. How do you enforce secure isolation with this design? A single kernel vulnerability could lead to remote code execution on the host, impacting all workloads. Am I missing anything?
Couldn't a NAT instance in front of the containers accomplish this as well (assuming it's only needed for outbound traffic)? The open source project fck-nat[1] looks amazing for this purpose.<p>[1] <a href="https://fck-nat.dev/stable/" rel="nofollow">https://fck-nat.dev/stable/</a>