This is very cool, my read is that it’s basically a WAL for sys-calls, because everything else is deterministic and can be reproduced. It’s a pretty clever, and afaik, novel idea.<p>This seems like a great way to run tests hermetically, but I’m skeptical about this being the best way to run almost anything else. It seems like an expensive and complex way to handle failure. The examples provided (calling stripe API, recording result in DB) seem better handled through higher-level state management vs rerunning the exact binary again. Why recreate the HTTP call, when you could/should write custom logic to handle the failure. Maybe you need to ask stripe about the failure, but an HTTP call to report metrics can be silently dropped on failure.<p>That said, it would be really interesting if you can run this just on a subset of code, you could build dedicated transaction support into key logic.