We're a small team of engineers that spent way too much time setting up user-triggered distributed workflows with K8s while working on our previous product, even though conceptually we knew exactly what our system was supposed to do [1].<p>So we decided to build Multinode. Its goal is to eliminate the hassle of setting up compute infrastructure, so you can run arbitrarily complex compute workloads without leaving the comfort of your local Python environment.<p>You can use it to set up distributed async workflows, continuously running daemons, APIs, etc., with the help of only 5 additional annotations (there's a rough sketch of what this looks like at the bottom of this post).<p><i>TECHNICAL DETAILS</i><p>Internally, all the infra is on AWS. All the compute runs on ECS with Fargate, using controllers similar to those used by Kubernetes. Unfortunately, Fargate has pretty annoying resource limitations (hence <a href="https://multinode.dev/docs/resources/cpus-gpus-memory" rel="nofollow noreferrer">https://multinode.dev/docs/resources/cpus-gpus-memory</a>), so we will likely port it to ECS on EC2, or just straight to K8s, at some point. The task registry and Multinode's dict run on Amazon Aurora. The product itself is written in Python, and the docs run on Firebase with Next.js, using one of the official Tailwind templates (I'd really recommend them!).<p><i>FEEDBACK NEEDED</i><p>In its current version, the entirety of your cloud compute would live in Multinode. But the actual #1 reason we decided to build it was those user-triggered autoscaled workflows [1]. We're now playing with the idea of dropping most of the features to focus on solving this core problem in a way that's easy to integrate with your existing infrastructure [2], instead of having to build everything in Multinode. Something like Airflow or Dagster, but with an emphasis on user-triggered workflows with very fast autoscaling and topologies defined at runtime. Do you think we should go with this change, or keep Multinode as it is now?<p>Also, apologies for the closed alpha at the moment: we're still figuring out how to put automatic pricing or quotas in place, and opening it up to the public before that would bankrupt us. In practice, pricing will be some markup on our compute cost. Any advice on what a reasonable markup usually looks like?<p>Any thoughts and feedback on those 2 points would be much appreciated!<p>[1]: For the curious, it was something similar to our chess example: <a href="https://multinode.dev/docs/core-concepts/job#distributing-a-job-workload-using-functions" rel="nofollow noreferrer">https://multinode.dev/docs/core-concepts/job#distributing-a-...</a>
[2]: Ideally, both compute and storage would be provisioned in your own AWS VPC, with only the control plane running on our infra, for maximum privacy and security.
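P.S. To give a flavor of the "5 annotations" mentioned above, here's a rough sketch of a fan-out workflow (decorator and method names here are simplified pseudocode, not the exact API; the real thing is in the chess example linked in [1]):

    import multinode as mn  # names below are illustrative

    @mn.function(cpu=1.0, memory="1 GiB")  # each call runs in its own container
    def evaluate_move(board, move):
        ...  # expensive computation; resources are requested per call

    @mn.job(cpu=0.5, memory="512 MiB")  # user-triggered entry point
    def find_best_move(board, moves):
        # Fan out one function call per candidate move; workers autoscale
        # to match, and the results are gathered back in the job.
        futures = [evaluate_move.spawn(board, m) for m in moves]
        return max(f.result() for f in futures)  # best score wins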