I'm writing some tooling for chat/conversation & have been thinking a lot about these optimization considerations.

One amazing piece of tooling I've come across is SST: https://serverless-stack.com/

They build on top of AWS CDK & they have a very clever way to do "local" development by injecting a websocket connection into your deployed Lambda, so you can work against real infra instead of mocks.
Initialization code is mentioned...

"In addition to that, consider whether you can move initialization code outside of the handler function."

There's an example too, but this could use more emphasis. I see quite a lot of code in the main body of lambdas that doesn't need to be run over and over, things that could be safely cached in a hashmap, etc.
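A minimal sketch of what I mean, assuming a DynamoDB lookup (the client, table name, and attribute are just placeholders, not from the article):

```typescript
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";

// Runs once per execution environment (cold start), not on every invocation.
const ddb = new DynamoDBClient({});
const cache = new Map<string, string>();

export const handler = async (event: { id: string }) => {
  // Only hit DynamoDB if this warm environment hasn't seen the key yet.
  if (!cache.has(event.id)) {
    const res = await ddb.send(
      new GetItemCommand({
        TableName: "example-table", // placeholder table name
        Key: { pk: { S: event.id } },
      })
    );
    cache.set(event.id, res.Item?.value?.S ?? "");
  }
  return { statusCode: 200, body: cache.get(event.id) };
};
```

The client and the Map live at module scope, so every invocation after the first in that environment skips the setup and (for repeated keys) the network call entirely.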
Regarding bullet 4, Provisioned Concurrency:

If you use AWS SDK v3 and the Node.js 14.x runtime, you can use top-level await. Top-level await lets you more easily do async initialization outside your handler code, before invocation. This has a major benefit of reducing cold start latency when using Provisioned Concurrency.

See https://aws.amazon.com/blogs/compute/using-node-js-es-modules-and-top-level-await-in-aws-lambda/
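Roughly what that looks like, assuming an SSM parameter fetch (the parameter name is hypothetical); the file must be an ES module (e.g. .mjs, or "type": "module") for top-level await to work:

```typescript
import { SSMClient, GetParameterCommand } from "@aws-sdk/client-ssm";

const ssm = new SSMClient({});

// Resolved once, during the init phase. With Provisioned Concurrency this
// runs ahead of time, so no invocation ever waits on it.
const dbUrl = (
  await ssm.send(new GetParameterCommand({ Name: "/example/db-url" }))
).Parameter?.Value;

export const handler = async () => {
  // By the time any request arrives, dbUrl is already available.
  return { statusCode: 200, body: `connected to ${dbUrl}` };
};
```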
> It's not clear at which point Lambda function receives access to more than 2 vCPU cores<p>This is one of the more annoying things about the service. Why can't AWS publish this information? I might get 4 vCPUs at 5308MB today, but there's no guarantee they won't raise that threshold to 6144MB tomorrow and cause my run times to increase. There should be a better way to figure out what my execution environment looks like than trial and error.