Hi everyone, faast.js is a library that allows you to use serverless to run batch processing jobs. It makes it super easy to run regular functions as serverless functions. This is one of my first open source projects and I'd be happy to answer any questions here.
From what I can tell, it's the invocation model and deployment that are unique here?<p>You invoke faast from your local machine (or build server, or cron job, whatever), and in turn it deploys some functions to a serverless platform and runs them, then tears them all down when complete. E.g., from the site, this code runs locally:<p><pre><code> import { faast } from "faastjs";
 import * as funcs from "./functions";
 (async () => {
   const m = await faast("aws", funcs);
   try {
     // m.functions.hello: string => Promise<string>
     const result = await m.functions.hello("world");
     console.log(result);
   } finally {
     await m.cleanup();
   }
 })();
</code></pre>
You wouldn't want to run <i>this code</i> on serverless, as you'd be paying for compute time spent just waiting for all the other tasks to complete.<p>It would be useful to see a discussion about how and where to host this entry code, maybe even a topic on "Running in production".<p>It's definitely a neat idea, because if you control the event that kicks everything off anyway (e.g. "create monthly invoices" or "build daily reports") you can deploy the latest version of everything, run it, and clean it up in essentially a single step.<p>(Please correct me if I've misunderstood any of the details here!)
This can be great for scraping jobs!<p>There are IP-based rate limiters on sites (LinkedIn, Facebook, etc.), but each lambda gets a new public IP, so by using faast.js I can stay under the radar.<p>Plus you can essentially spawn a headless Chrome (puppeteer) to do advanced stuff.
Very interesting project. The problem with the serverless offerings from different public cloud vendors is that their programming models and APIs are not uniform. I think faast.js is on the right path to creating a unified interface across serverless services.
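To make that concrete, here is a toy sketch (illustrative only; every name below is made up and this is not faast.js's actual internal design) of what a provider-agnostic invoke interface could look like, with a trivial in-process "local" backend standing in for real cloud backends:

```typescript
// A task is just a plain function; the unified interface hides where it runs.
type Task = (input: string) => string;

interface Backend {
  // Deploy-and-invoke collapsed into a single call for this sketch.
  invoke(fn: Task, input: string): Promise<string>;
}

// A trivial "local" backend that runs the function in-process; real
// backends would package the function and invoke it on AWS, GCP, etc.
const backends: Record<string, Backend> = {
  local: {
    invoke: async (fn, input) => fn(input),
  },
};

async function run(provider: string, fn: Task, input: string): Promise<string> {
  const backend = backends[provider];
  if (!backend) throw new Error(`no backend for provider: ${provider}`);
  return backend.invoke(fn, input);
}

run("local", (s) => s.toUpperCase(), "hello").then((r) => console.log(r)); // → HELLO
```

The visible API in the example above has the same shape: `faast("aws", funcs)` takes the provider as a string, so the calling code stays the same when the backend changes.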
Love what you did!<p>We recently were in exactly this situation: we had to do heavy processing of ~4000 items, each running between 1 and 10 minutes.
To speed the process up we ran it on Lambda. That took our processing time down from 10+ hours on a single-core machine to about 15 minutes running on 4000 lambdas.<p>Your library would have saved us quite some work, as it would take away a lot of the AWS config, deployment, etc.<p>Btw: I'm thinking of building a similar library for multi-core/web workers on Node.js. Currently a lot of boilerplate is required in Node.js to make a loop run in parallel on all cores.
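For what it's worth, the boilerplate in question can be sketched with Node's built-in worker_threads module. This toy `parallelMap` (all names are illustrative, and the worker body is inlined as a string only to keep the example self-contained) chunks an array and fans the chunks out to one worker per core:

```typescript
import { Worker } from "node:worker_threads";
import * as os from "node:os";

// Worker body inlined as a string so the sketch is self-contained;
// normally this would live in its own worker file.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  // Square each item in this worker's chunk (a stand-in for real work).
  parentPort.postMessage(workerData.map((x) => x * x));
`;

// Split items into at most `parts` contiguous chunks.
function chunk(items: number[], parts: number): number[][] {
  const size = Math.ceil(items.length / parts);
  const out: number[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Run one worker per chunk and reassemble the results in order.
async function parallelMap(items: number[]): Promise<number[]> {
  const cores = Math.max(1, Math.min(os.cpus().length, items.length));
  const results = await Promise.all(
    chunk(items, cores).map(
      (c) =>
        new Promise<number[]>((resolve, reject) => {
          const w = new Worker(workerSource, { eval: true, workerData: c });
          w.once("message", resolve);
          w.once("error", reject);
        })
    )
  );
  return results.flat();
}

parallelMap([1, 2, 3, 4]).then((r) => console.log(r)); // → [ 1, 4, 9, 16 ]
```

A general-purpose version is where the boilerplate piles up: arbitrary functions can't be structured-cloned across the worker boundary, so they have to live in a separate worker file or be serialized somehow.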
This is very neat! Last year I had to do essentially this on GCP and relied on a very similar implementation. Everyone was surprised to see JS being used for data processing, but it worked wonderfully.<p>One thing I want to ask about is retries: how do you handle them currently? I ran into multiple cases where functions would fail for transient reasons.
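For transient failures, one generic pattern is to wrap each remote call in retries with exponential backoff. A minimal standalone sketch (hypothetical helper names; this is not faast.js's built-in retry mechanism, so check the library's options for what it actually provides):

```typescript
// Retry an async call up to `attempts` times with exponential backoff.
async function withRetries<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}

// Demo: a call that fails twice with a transient error, then succeeds.
let calls = 0;
withRetries(async () => {
  calls++;
  if (calls < 3) throw new Error("transient failure");
  return "done";
}, 5, 10).then((result) => console.log(result, "after", calls, "attempts"));
// → done after 3 attempts
```

In practice you'd also want to distinguish transient errors (throttling, timeouts) from permanent ones and rethrow the latter immediately rather than burning retries on them.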