Thank you for this nice writeup! This paper was led by my student Sadjad Fouladi (<a href="https://sadjad.org/" rel="nofollow">https://sadjad.org/</a>), part of a broader theme of coercing a "purely functional-ish" design onto everyday applications. There's a less academic-ese version with a few extended results that was published in ;login: magazine (<a href="https://www.usenix.org/system/files/login/articles/login_fall19_02_fouladi.pdf" rel="nofollow">https://www.usenix.org/system/files/login/articles/login_fal...</a>). There was also a good analysis here (<a href="https://buttondown.email/nelhage/archive/papers-i-love-gg/" rel="nofollow">https://buttondown.email/nelhage/archive/papers-i-love-gg/</a>), and don't miss <a href="https://buttondown.email/nelhage/archive/http-pipelining-s3-and-gg/" rel="nofollow">https://buttondown.email/nelhage/archive/http-pipelining-s3-...</a>.<p>Some of Sadjad's other work has included:<p>- ExCamera, which somewhat kicked off the trend of "fire up 4,000 Lambda workers in a burst, all working on one job" -- for things like making a neural network search a video frame-by-frame, video compression in parallel at sub-GOP granularity, etc. (<a href="https://news.ycombinator.com/item?id=16197253" rel="nofollow">https://news.ycombinator.com/item?id=16197253</a>)<p>- Salsify, which reused the "purely functional" video codec from ExCamera to improve WebRTC/Zoom-style live video (<a href="https://news.ycombinator.com/item?id=16964112" rel="nofollow">https://news.ycombinator.com/item?id=16964112</a>, <a href="https://news.ycombinator.com/item?id=20794541" rel="nofollow">https://news.ycombinator.com/item?id=20794541</a>). Sadjad is giving an Applied Networking Research Prize talk about this work at IETF tomorrow.<p>- 3D ray-tracing (running PBRT on thousands of Lambdas, sending rays across the network), SMT/SAT solving, etc.<p>We're working to extend this line of work towards a more general, Wasm-based, "purely functional" operating system where most computations operate on content-addressed data and are content-addressed themselves, and where determinism and reproducibility are properties guaranteed by the OS -- sort of analogous to how the operating systems of today (try to) guarantee memory isolation between processes. Imagine, e.g., a Git repository where you could represent the fact that "blob <x> is the result of running computation <y> given tree <z> as input," and anybody could verify that result, or rebase the computation to run on top of their own input. If you're interested in this general area, please consider doing a PhD at Stanford and/or get in touch -- I'm hiring.
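To make the Git analogy concrete, here is a toy sketch in Python (entirely hypothetical -- the names and store layout are illustrative, not gg's actual design) of a content-addressed object store plus records saying "blob <x> is the result of running computation <y> given tree <z> as input":

    import hashlib, json

    def address(blob: bytes) -> str:
        # A blob's content address is just the hash of its bytes.
        return hashlib.sha256(blob).hexdigest()

    store: dict[str, bytes] = {}  # content-addressed object store, like Git's

    def put(blob: bytes) -> str:
        h = address(blob)
        store[h] = blob
        return h

    # Each record asserts: computation <y> applied to tree <z> yields blob <x>.
    # Anyone holding <y> and <z> can re-run the computation to verify <x>, or
    # substitute their own tree to "rebase" the computation onto a new input.
    records: dict[str, str] = {}

    def run(computation_hash: str, tree_hash: str, fn) -> str:
        # fn stands in for the code object that computation_hash names; in a
        # real system it would be fetched from the store by its own hash.
        key = address(json.dumps([computation_hash, tree_hash]).encode())
        if key not in records:  # deterministic, so one evaluation suffices
            records[key] = put(fn(store[tree_hash]))
        return records[key]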
My work uses GCP, not AWS, so I've been experimenting with Google Cloud Run (it's actually parallelizing R code, so I need the Docker container infra). My only problem is that I have very bursty usage and the auto-scaling is too slow. I made one attempt [1] to encourage a larger allocation, but I don't know of another way. Does anyone have experience with this?<p>[1] Slightly costly, but ~5 minutes before I need it, I set the minimum instance count to a larger number so it starts ramping up; then, when I'm done, I lower it.
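Concretely, that bump-then-lower looks something like this with the google-cloud-run v2 Python client (a sketch only; PROJECT, REGION, and SERVICE are placeholders, and note that each update rolls out a new revision):

    from google.cloud import run_v2  # pip install google-cloud-run

    def set_min_instances(name: str, count: int) -> None:
        # Pre-warm Cloud Run by raising the minimum instance count;
        # lower it again once the bursty job finishes.
        client = run_v2.ServicesClient()
        service = client.get_service(name=name)
        service.template.scaling.min_instance_count = count
        client.update_service(service=service).result()  # wait for rollout

    SERVICE = "projects/PROJECT/locations/REGION/services/SERVICE"
    set_min_instances(SERVICE, 20)  # ~5 minutes before the burst
    # ... run the parallel R workload ...
    set_min_instances(SERVICE, 0)   # scale back down afterwards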
This could be very useful for quantum chemistry simulations, which are generally parallelizable and very CPU-intensive. If gg were tweaked to support MPI, this niche could see a breakthrough!