Our field is getting closer and closer to architecture and medicine now. With tech like this we can leave the fiddling age behind and dive deep into a serious engineering culture.<p>Quick question: how do you prevent the effects of a DoS attack from being persisted on those systems?
Does this determinism extend to floating point computations? This has historically been a pain point with multiplayer games where the client state has to be periodically re-synced with the server state due to slowly accumulating drift in floating point calculations.
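To make the drift concern concrete, here is a minimal Python sketch (not Flawless code) showing that the same floats summed in a different order give different results. As far as I know, WebAssembly pins float arithmetic to IEEE 754 and leaves only NaN bit patterns unspecified, so replaying the same operations in the same order should reproduce the same values, but that is my assumption about the runtime, not something the article states.

```python
import math

values = [0.1, 0.2, 0.3]

# Same numbers, different evaluation order: float addition is not associative.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# The two orderings disagree in the last bit.
orders_disagree = left_to_right != right_to_left

# math.fsum returns the correctly rounded sum, so it does not depend on order.
stable_forward = math.fsum(values)
stable_backward = math.fsum(reversed(values))
```

The point is that drift comes from differing evaluation order or precision between machines, not from floats being random. A replay that fixes the order of operations sidesteps this particular source of divergence.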
Where is the state for the side effects stored? Say I have an AWS Lambda that I want to make idempotent. Lambdas don’t have local storage that persists across runs (unless you mount an EFS file system or something) so I presume state can be stored in a DB?
Looks interesting. I wonder if the method of marking functions as 'having side effects' is going to be easy to make foolproof. For instance, I assume that in the example the random number generation is a side effect because it comes from an RNG provided by Flawless itself. Would this have worked with a regular Rust function as well?<p>I assume there is going to be some kind of test harness that allows developers to check their workflows.
Looks like a Rust alternative to Temporal using Wasm as a runtime. Love it!<p>Founder of windmill.dev here, which is another durability engine in Rust, except it's a lot less elegant: we split our workflows into well-defined steps in Python/TypeScript/Go/Bash and can resume from incomplete steps only by restarting from the last step and storing the result of each step forever in our Postgres DB (using jsonb). The use cases are clearly different, and I can see Flawless being so lightweight that you could use it to model UI flow state and scale it to millions on a small server, as pointed out by the site.<p>This is fantastic, hopefully one day Rust will power all distributed systems.
“Any sufficiently complicated concurrent program in another language contains an ad hoc informally-specified bug-ridden slow implementation of half of Erlang.” – Virding’s first rule of programming
Very cool, and the approach demonstrated might be of interest to a similar problem we have in Ambient (our WASM game runtime that has competing processes that may need to retry interactions.)<p>That being said - what’s the relation to Lunatic [0]? Are you still working on Lunatic? Is this a side project? Or is it something completely separate?<p>[0]: <a href="https://lunatic.solutions/">https://lunatic.solutions/</a>
> Imagine if you could just start an arbitrary computation and the system guarantees that it will run until completion and all the operations will be performed exactly once.<p>How is this guaranteed? Isn't exactly once delivery in a distributed system impossible?
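One common way to square this, sketched below in Python, is to accept at-least-once delivery and have the receiver deduplicate on an idempotency key, so the effect is applied once even if the request arrives several times. `PaymentService`, `charge`, and `deliver_at_least_once` are hypothetical names for illustration. I am not claiming this is how Flawless does it.

```python
class PaymentService:
    """Hypothetical receiver that deduplicates requests by idempotency key."""

    def __init__(self):
        self.processed = {}  # idempotency key -> receipt already issued
        self.charges = []    # the actual side effects that were performed

    def charge(self, key, amount):
        # A retried request with a known key returns the stored receipt
        # instead of performing the charge again.
        if key in self.processed:
            return self.processed[key]
        self.charges.append(amount)
        receipt = f"receipt-{len(self.charges)}"
        self.processed[key] = receipt
        return receipt


def deliver_at_least_once(service, key, amount, attempts):
    """Simulate a sender that retries because it never sees an acknowledgement."""
    receipt = None
    for _ in range(attempts):
        receipt = service.charge(key, amount)
    return receipt
```

Calling `deliver_at_least_once(service, "order-42", 100, 3)` sends the request three times but leaves a single entry in `service.charges`. Delivery is still at-least-once. Only the processing is effectively exactly-once, and only for receivers that cooperate.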
Recently, another alternative was teased out: <a href="https://www.golem.cloud/" rel="nofollow noreferrer">https://www.golem.cloud/</a><p>From the folks at Ziverge, who've worked on ZIO in Scala.<p>They use a similar approach I believe. It's discussed in this podcast: <a href="https://podcasters.spotify.com/pod/show/happypathprogramming/episodes/85-Scala--Rust--and-Durable-Computing-with-John-De-Goes-e29ca1s" rel="nofollow noreferrer">https://podcasters.spotify.com/pod/show/happypathprogramming...</a>
If I understand correctly, Flawless provides the WASM runtime. If that’s the case, why can’t it be entirely hidden from the user? The system provides things like entropy and networking.
How is something like this implemented? Does this hook into the Rust compiler? Or does the Rust/Wasm compiler architecture provide an intermediate step that allows these kinds of systems to be built on top?
This looks very cool. I expect you’ll still want the services you talk to to be idempotent, but Flawless still takes a big chunk out of the work - and it seems very flexible too!
I have wanted something similar for Python: the execution of a function is interrupted (i.e. via a dedicated exception), and then I can rerun it up to the very point where it previously halted, with none of the prior side effects occurring again and the previous state within that function restored.<p>While I have an idea of how to implement it, after reading the article and the comments here I wonder: what is this concept called? Does an implementation for Python already exist?
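This is usually discussed under names like "durable execution" or "record and replay". A minimal Python sketch of the idea follows, assuming the function is deterministic apart from calls routed through a journal. `Journal`, `effect`, `Interrupted`, and `send_email` are hypothetical names, and the journal lives in memory here where a real system would persist it.

```python
class Interrupted(Exception):
    """Raised to halt a workflow at a chosen point."""


class Journal:
    """Records side-effect results in call order and replays them on rerun."""

    def __init__(self):
        self.entries = []  # results of side effects already performed
        self.cursor = 0    # position within the journal during the current run

    def begin_run(self):
        self.cursor = 0

    def effect(self, fn, *args):
        if self.cursor < len(self.entries):
            # Replay: reuse the recorded result and skip the real call.
            result = self.entries[self.cursor]
        else:
            # First time at this point: perform the call and record it.
            result = fn(*args)
            self.entries.append(result)
        self.cursor += 1
        return result


calls = []  # tracks which real side effects actually happened


def send_email(to):
    calls.append(to)
    return f"sent:{to}"


def workflow(journal, fail_after_first):
    first = journal.effect(send_email, "a@example.com")
    if fail_after_first:
        raise Interrupted()
    second = journal.effect(send_email, "b@example.com")
    return [first, second]


def run(journal, fail_after_first):
    journal.begin_run()
    return workflow(journal, fail_after_first)
```

Running `run(journal, True)` raises `Interrupted` after the first email. Running `run(journal, False)` on the same journal replays the first result without sending again and only performs the second send. Local state is rebuilt by re-executing the function, which is why everything nondeterministic has to go through `effect`.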
The boring way to do guaranteed execution with retries is to store the work items in Postgres and use a pool of workers. The workers store intermediate results in a cache.<p>Such systems need tooling for diagnosing and fixing problems: metrics, logging, dead-letter queue, inspecting and evicting cached items, retrying dead jobs, adjusting worker settings. Flawless and similar systems will have the same problems and need the same tools.
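A rough sketch of that boring approach in Python, using an in-memory SQLite table as a stand-in for Postgres so the example is self-contained. The `jobs` schema, `work_once`, and the `dead` status used as a dead-letter marker are assumptions for illustration. With several concurrent workers on Postgres you would typically claim rows with `SELECT ... FOR UPDATE SKIP LOCKED`, which this single-worker sketch omits.

```python
import sqlite3


def make_queue():
    """Create an in-memory jobs table (SQLite standing in for Postgres)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, payload TEXT, "
        "status TEXT DEFAULT 'pending', attempts INTEGER DEFAULT 0)"
    )
    return db


def enqueue(db, payload):
    db.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    db.commit()


def work_once(db, handler, max_attempts=3):
    """Process the oldest pending job; return False when nothing is pending."""
    row = db.execute(
        "SELECT id, payload, attempts FROM jobs "
        "WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    job_id, payload, attempts = row
    try:
        handler(payload)
        status = "done"
    except Exception:
        # Retry until the attempt budget is spent, then park the job as dead.
        status = "dead" if attempts + 1 >= max_attempts else "pending"
    db.execute(
        "UPDATE jobs SET status = ?, attempts = attempts + 1 WHERE id = ?",
        (status, job_id),
    )
    db.commit()
    return True
```

A worker is then just `while work_once(db, handler): pass`. The `status` and `attempts` columns are what the diagnostic tooling mentioned above would query: dead jobs, retry counts, and so on.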
Supporting only Rust and WebAssembly seems very restrictive. What does it bring to the table vs. Temporal, for example, which works with any language and is rock solid?
This approach should also work well for Haskell where "main" is already a pure computation producing a description of what IO actions to take.
The “restart from where it failed” aspect was a big reason why I made Mats3. It is message-based, async, transactional, staged stateless services, or message-oriented asynchronous RPC. Due to the transactionality, and the “state lives on the wire”, if a flow fails, it can be restarted from where it left off.<p><a href="https://mats3.io" rel="nofollow noreferrer">https://mats3.io</a>
I’ve seen a couple of these now. Looks like a lot of them take the “we will take your code and compile it” approach.<p>That’s not a problem per se, but it does affect the ability to debug and see relevant stack traces. It’s like how sometimes you see transpiled JS when what you really want is the TypeScript, via source maps.
Very tenuously related, but the idea of logging all nondeterministic effects for idempotence can also be used to make locks lock-free!<p><a href="https://arxiv.org/abs/2201.00813" rel="nofollow noreferrer">https://arxiv.org/abs/2201.00813</a>
This deterministic execution pattern reminds me a lot of the approach with Azure Durable Functions and I know I've seen it in other places as well. Curious if anyone knows where that pattern originated.
Is this an explicit feature of the wasm runtime in use here, or might this break with future optimizations in the runtime and the introduction of threads and other similar features?
I was thinking of something like this for running queries for sharded data: the execution could be paused and moved to the server that is closer to the data source at a given runtime.<p>> Notice how flawless takes away the burden of persisting the state.<p>Reminds me of the Wisdom of the Well: "You'll never find a programming language that frees you from the burden of clarifying your ideas." [1]<p>Although solving durability of execution for arbitrary code sounds super cool, I suspect that writing the code in a way that could be checkpointed/resumed more naturally would probably be easier to debug and implement, in the end.<p>--<p>1: <a href="https://xkcd.com/568/" rel="nofollow noreferrer">https://xkcd.com/568/</a>