
AWS Lambda Cold Start Times

355 points by valgaze over 3 years ago

44 comments

gunnarmorling over 3 years ago
The best cold starts are those that aren't noticed by the user. For my blog search (which runs on Lambda), I found a nice way of achieving that [1]: as soon as a user focuses the search input field, the page already submits a "ping" request to Lambda. Then, when they submit the actual query, they hit an already-running Lambda most of the time.

And, as others have said, assigning more RAM to your Lambda than it actually needs will also help with cold start times, as this increases the assigned CPU share too.

[1] https://www.morling.dev/blog/how-i-built-a-serverless-search-for-my-blog/
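A minimal sketch of the trick in browser TypeScript, assuming a hypothetical search endpoint that treats a "ping" parameter as a no-op warmup request:

```ts
// Warm the search Lambda as soon as the user shows intent to search.
// The endpoint path and ping parameter are placeholders; any cheap
// request that reaches the function will trigger the cold start early.
const searchInput = document.querySelector<HTMLInputElement>("#search");

searchInput?.addEventListener(
  "focus",
  () => {
    // Fire-and-forget: we only want the side effect of a warm container.
    fetch("/search?ping=1").catch(() => {
      /* warming is best-effort; ignore failures */
    });
  },
  { once: true } // one ping per page view is enough
);
```

By the time the user has typed a query and pressed enter, the cold start has usually already been paid.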
cmcconomy over 3 years ago
My experience with cold starts in Azure Functions serverless is pretty awful. Like most other Azure services, their affordable consumer-grade offerings are designed from the ground up not to be good enough for "serious" use.

Cold start times are worse than Lambda's, and in addition we would get random 404s which do not appear in any logs. Inspecting these 404s indicated they were emitted by nginx, leading me to believe that the ultimate container endpoint was killed for whatever reason, but that fact didn't make it back to the router, which then attempted and failed to reach the function.

Of course, the cold starts and 404s are mitigated if you pay for the premium serverless tier or just host the functions on their App Service plans (basically VMs).
_fat_santa over 3 years ago
At my last job we built an entire API on top of serverless. One of the things we had to figure out was cold start time: if a user hit an endpoint for the first time, it would take about 2x as long as it normally would. To combat this we wrote a "runWarm" function that kept the API alive at all times.

Sure, it kind of defeats the purpose of serverless, but hey, enterprise software.
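The commenter's actual runWarm code isn't shown; here is a minimal sketch of the usual pattern, assuming a scheduled EventBridge rule that invokes the function every few minutes with a marker payload like {"warmer": true} (types from the @types/aws-lambda package):

```ts
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";

// Marker check for the hypothetical warmup payload sent by the schedule.
const isWarmerEvent = (event: unknown): boolean =>
  typeof event === "object" && event !== null && "warmer" in event;

export const handler = async (
  event: APIGatewayProxyEvent | { warmer: true }
): Promise<APIGatewayProxyResult> => {
  if (isWarmerEvent(event)) {
    // Short-circuit: this invocation exists only to keep a container alive.
    return { statusCode: 200, body: "warm" };
  }
  // ... normal request handling goes here ...
  return { statusCode: 200, body: "hello" };
};
```

Note that one scheduled ping keeps only one container warm; concurrent traffic beyond that will still see cold starts.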
psanford over 3 years ago
Something I discovered recently: for my tiny Go Lambda functions it is basically always worth it to run them with at least 256MB of memory, even if they don't need more than 128MB. This is because most of my functions run twice as fast at 256MB as they do at 128MB. Since Lambda pricing is memory_limit times execution time, you get the better performance for free.

Test your Lambda functions in different configurations to see if the optimal setting is different from the minimal setting.
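A rough way to run that comparison yourself, sketched with the AWS SDK for JavaScript v3 (the function name, memory sizes, and run count are placeholders, and CloudWatch duration metrics would be more precise than this client-side timing):

```ts
import {
  LambdaClient,
  UpdateFunctionConfigurationCommand,
  InvokeCommand,
} from "@aws-sdk/client-lambda";

const client = new LambdaClient({});
const FN = "my-function"; // placeholder function name

async function avgInvokeMs(memorySize: number, runs = 20): Promise<number> {
  await client.send(
    new UpdateFunctionConfigurationCommand({ FunctionName: FN, MemorySize: memorySize })
  );
  // Crude settle time; properly you'd poll until LastUpdateStatus is "Successful".
  await new Promise((resolve) => setTimeout(resolve, 5000));

  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    await client.send(new InvokeCommand({ FunctionName: FN }));
  }
  return (Date.now() - start) / runs;
}

async function main() {
  for (const mb of [128, 256, 512, 1024]) {
    console.log(`${mb}MB: ${(await avgInvokeMs(mb)).toFixed(1)}ms avg`);
  }
}

main().catch(console.error);
```

The open-source AWS Lambda Power Tuning tool automates this kind of sweep with Step Functions if you'd rather not hand-roll it.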
jabart over 3 years ago
We run a few .NET Core Lambdas, and two things make a big difference for latency. 1. Pre-JIT the package; this reduces cold start times since the JIT doesn't need to run on most items (it still runs later to optimize some). 2. Stick to the new .NET JSON serializer (System.Text.Json). The reference code uses both the new serializer and the old Newtonsoft package; the old package has higher memory allocations because it doesn't make use of the Span type.
projectileboy over 3 years ago
AWS Lambda is pretty cool; it just gets used a lot for applications it was never really designed for. While I wish Amazon would address the cold start times, if you try to grill your burgers with a cordless drill, you can't really blame the drill manufacturer when the meat doesn't cook.
davewritescode over 3 years ago
The main downside of Lambda, in particular for user-facing applications, is that your incentives and the cloud provider's are completely opposed. You (the developer) want a bunch of warm Lambdas ready to serve user requests, while the cloud provider wants to minimize costs by keeping the number of running Lambdas as low as possible. It's that incentive model that fundamentally makes Lambda a poor choice for these types of applications.

Other downsides include the fact that Lambdas have fixed memory sizes. If you have units of work that vary in how much memory they require, you're basically stuck paying the cost of the largest unit of work unless you can implement some sort of routing logic somewhere else. My company ran into this issue using Lambdas to process some data where 99% of requests ran fine in 256MB but a few required more, and there was no way to know ahead of time how much memory a computation would need. We eventually found a way to deal with it, but in the short term we had to bump the Lambda memory limits.

That doesn't even get into the problems with testing.

In my experience, Lambdas are best used as glue between AWS components, message processors, and cron-style tasks.
e3bc54b2 over 3 years ago
I just want to appreciate the article: a non-clickbait title, an upfront summary, detailed numbers, code for reruns, great graphs, no dreamy story, and no advertisement of any kind.

It is hosted on Medium, but the author has done a banging great job, so it gets a pass. If he is reading: excellent work!
rcarmo over 3 years ago
I recently discovered that uWSGI has a "cheap mode" that will hold the socket open but only actually spawn workers when a connection comes in (and kill them automatically after a timeout without any requests).

Pertinent options: https://github.com/piku/piku/blob/master/piku.py#L908

If you already have 24/7 compute instances going and can spare the CPU/RAM headroom, you can co-host your "lambdas" there and make them even cheaper :)
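For reference, a sketch of the relevant uWSGI ini options as I understand them (treat the exact names and their interactions as an assumption to double-check against the uWSGI docs for your version):

```ini
[uwsgi]
; cheap mode: bind the socket but don't spawn workers
; until the first connection arrives
cheap = true
; consider the instance idle after 60s without requests
idle = 60
; actually kill idle workers rather than just suspending them
die-on-idle = true
```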
rajin444 over 3 years ago
Cool article, but shouldn’t Amazon be providing this kind of info? Surely they have this data internally.
mulmboy over 3 years ago
A pattern I have implemented is to have my API code on both ECS/Fargate and Lambda at the same time, and to send traffic to the appropriate one using an Elastic Load Balancer. I flag specific endpoints as "CPU intensive" and have them run on Lambda.

Implemented by:

- Duplicating all routes in the API with the "/sls/" prefix (this is a couple of lines in FastAPI)
- Setting up a rule in ELB to route to Lambda if the route starts with /sls, or to ECS otherwise
- Setting up the CPU-intensive routes to automatically respond with a 307 to the same route prefixed with /sls (see the sketch below)

Boom: with that, the system can handle bursts of CPU-intensive traffic (e.g. data exports) while remaining responsive to the simple 99% of requests, all on one vCPU.

And the same Dockerfile, with just a tiny change, can be used both in ECS and in Lambda.
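The commenter did this in FastAPI; purely as an illustration of the redirect step, here is the same idea in TypeScript/Express (route names and port are made up):

```ts
import express from "express";

const app = express();

// Hypothetical routes we consider CPU-intensive.
const cpuIntensive = new Set(["/export", "/render"]);

// Runs on the ECS-hosted copy of the API. The ELB rule sends /sls/*
// to the Lambda copy, so this middleware never sees those paths.
app.use((req, res, next) => {
  if (cpuIntensive.has(req.path)) {
    // 307 preserves the HTTP method and body, unlike 301/302.
    return res.redirect(307, "/sls" + req.originalUrl);
  }
  next();
});

// Light routes are handled here directly; the /sls twin of each
// heavy route is served by the Lambda copy behind the load balancer.
app.get("/status", (_req, res) => {
  res.send("ok");
});

app.listen(3000);
```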
tinyprojects over 3 years ago
If anyone is running into cold start problems on Firebase, I recently discovered you can add .runWith({minInstances: 1}) to your cloud functions.

It keeps one instance running at all times and, for the most part, completely gets rid of cold starts. You have to pay a small cost each month (a few dollars), but it's worth it on valuable functions that result in conversions, e.g. loading a Stripe checkout.
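In context, that looks roughly like this with the v1 firebase-functions API (the checkout handler itself is a placeholder):

```ts
import * as functions from "firebase-functions";

// Keep one instance warm for the function that loads the Stripe
// checkout, since a cold start here can cost a conversion.
export const createCheckout = functions
  .runWith({ minInstances: 1 })
  .https.onRequest(async (_req, res) => {
    // ... create and return the checkout session here (placeholder) ...
    res.status(200).send("ok");
  });
```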
losvedir over 3 years ago
I'm surprised Node has cold-start issues. I had it in my mind that JS was Lambda's "native" language and wouldn't have cold start issues at all. Did it used to be like that? Didn't Lambda launch with only support for JS, and maybe a couple other languages that could compile to it?
fulafel over 3 years ago
Container-based Lambda image configurations (vs zip-based) would be a good addition to this comparison; people use them e.g. to get around the zip-based Lambda size limit.

Also maybe mention provisioned concurrency (where you pay AWS to keep one or more instances of your Lambda warm).

Both of these are supported by the Serverless framework, btw.
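If memory serves, both features look roughly like this in serverless.yml (treat the exact keys as an assumption to verify against the Serverless framework docs; the image URI is a placeholder):

```yaml
functions:
  api:
    # deploy from a container image in ECR instead of a zip artifact
    image: <account>.dkr.ecr.us-east-1.amazonaws.com/my-repo@sha256:<digest>
    # pay AWS to keep two instances initialized at all times
    provisionedConcurrency: 2
```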
haolez over 3 years ago
Slightly off topic, but what's the deal with Azure Functions cold start times in the Consumption (i.e. serverless) plan? I get cold start times in the multi-second range (sometimes huge values, like 20s). Am I doing something wrong? Or is this expected?
mikesabbagh over 3 years ago
The conclusion here is to write one huge Lambda instead of several small Lambdas, right?
bilalq over 3 years ago
> NodeJs is the slowest runtime, after some time it becomes better(JIT?) but still is not good enough. In addition, we see the NodeJS has the worst maximum duration.

The conclusion drawn about NodeJS performance is flawed due to a quirk of the default settings in the AWS SDK for JS compared to other languages: by default, it opens and closes a TCP connection for each request. That overhead can be greater than the time actually needed to interact with DynamoDB.

I submitted a pull request to fix that configuration [0]. I expect the performance of NodeJS warm starts to look quite a bit better after that.

[0]: https://github.com/Aleksandr-Filichkin/aws-lambda-runtimes-performance/pull/6
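For reference, the standard connection-reuse fix for the v2 JS SDK takes one of two forms (I haven't verified which the linked PR uses, but it is presumably along these lines):

```ts
// Option 1: no code change; set this in the Lambda environment:
// AWS_NODEJS_CONNECTION_REUSE_ENABLED=1

// Option 2: configure the SDK with a keep-alive agent explicitly.
import * as AWS from "aws-sdk";
import * as https from "https";

const agent = new https.Agent({ keepAlive: true });
AWS.config.update({ httpOptions: { agent } });

// Subsequent DynamoDB calls reuse the TCP/TLS connection instead
// of paying the handshake cost on every request.
const ddb = new AWS.DynamoDB.DocumentClient();
```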
NicoJuicy over 3 years ago
Any experience with Cloudflare Workers? They are supposed to have 0 cold start time: https://blog.cloudflare.com/eliminating-cold-starts-with-cloudflare-workers/
erikerikson over 3 years ago
I was surprised by the quality of this one. That said...

Cold starts are a FaaS learning subject, but they almost never matter much in practice. What workloads are intermittent and also need extremely low latencies? Usually when I see people worrying about this, it is because they have architected their system with call chains, and the use case, if it really matters, can be re-architected so that the query result is prepared ahead of time. This is much like search results: search engines certainly don't process the entire web to service your queries. Instead, they pre-calculate the result for each query and update those results as content CRUD happens.
tiew9Vii over 3 years ago
The Rust metrics are interesting.

I've been getting around 20ms cold starts and 1ms warm executions on the 128MB ARM Graviton2 using Rust for the most basic test cases. Graviton2 was slightly slower on cold starts than x86 for me (1-2ms), but who doesn't want to save $0.0000000004 per execution? Adding calls to Parameter Store/DynamoDB bumps it up a little, but still < 120ms cold, and any added latency comes from waiting on the external service calls.

Memory usage is 20-30MB, and I haven't done anything to optimize memory. I know I can get rid of a few allocations I'm doing for simplicity if I want to.

I've not always been the greatest fan of Lambdas, seeing as they have hidden orchestration complexity and are a black box for debugging. Revisiting a few years on, and with Rust, you get an excellent language, excellent runtime characteristics, and substantial cost savings, unless you really need more than 128MB of memory, i.e. processing large volumes of data per execution in memory, or transcoding. Any asynchronous/event-driven service I write, I'll just package as a Rust Lambda going forward and pay fractions of a cent per month. I am still on the fence about HTTP-exposed services, as that's a big plumbing exercise with hidden gateway costs, but I'm not as averse to it as I was.
thdxr over 3 years ago
This is great work - thanks for putting this together. We recently got a request to add Rust support to ServerlessStack, and it looks like there's good reason to :)
davidjfelix over 3 years ago
You can shave even more cold-start time off the Go version by building with `go build -tags lambda.norpc` and deploying it as a custom runtime.
rimutaka over 3 years ago
The article states that it takes approximately 1,000 requests to optimize / warm up. I suspect they had concurrency set at the default 999, so the first 999 requests would each spin up a new instance.

Does that mean their 15,000 requests were actually 15 requests spread over 1,000 instances?
muh_gradle over 3 years ago
Surprised to see such mediocre performance from Node. It was an engineering decision on our team to develop one of our Lambdas with Node, and we were deciding between Python and Node. Go and Rust look very promising here.
somethingAlex over 3 years ago
Honestly shocked that Rust is about 4 times faster than Node for the DynamoDB insert workload in the average case. I knew it'd be faster, but I would have expected maybe 50% faster, since most of the time is probably spent simply sending the data to DynamoDB and awaiting a response.

Also, what is up with Python being faster than Node in the beginning and then getting slower over time? The other languages (apart from GraalVM) get faster over time. I'm referring to the average ms latency at 128MB graph at the bottom.
tyingq over 3 years ago
Nice to have some updated data and comparisons. This article doesn't include the effect of the Lambda having to connect to a VPC, though, which adds time for the ENI. That was greatly improved in 2019-2020: https://aws.amazon.com/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/
kidsil over 3 years ago
I find it surprising that NodeJS is showing the worst cold start times. Most likely over 90% of AWS Lambda functions are written in NodeJS.
moonchrome over 3 years ago
One thing I wish he'd included is .NET 5 - since that switched to using Docker images, I would be very interested in the differences.
masklinn over 3 years ago
The 128MB case is really strange: why do Go and Rust take so much more time to start than on higher-capacity configurations, and even more than Python? Do they get run inside a wasm runtime or something, and that runtime has to go back and forth requesting memory which a native Python runtime gets "for free"?
yoava over 3 years ago
The article states that 600ms is a low cold start figure. However, a 600ms cold start is still unacceptable for web apps.

For a web app, the figure should be around 100ms. The only platform that meets that figure (that I know of) is Velo by Wix, with around 50ms cold starts for Node.js.
flurie over 3 years ago
I would have liked to see more values along the Lambda "breakpoints" between 1GB and 10GB of memory. Unless things have changed recently, my understanding is that CPU and IO scale up specifically at those breakpoints rather than scaling continuously.
sp33der89 over 3 years ago
I would love to see languages like OCaml, D, and Nim benchmarked here as well. They sit sort of in between Go and Rust, where I don't have to deal with manual memory management but get enough expressiveness to write a nice Lambda.
l8again over 3 years ago
I wish the author had done a comparison for Java apps using jlink, which generates a custom Java runtime image containing only the platform modules required by a given application, to see if that makes a difference.
unraveller over 3 years ago
Just the cold-start data: https://gist.github.com/Aleksandr-Filichkin/925fce9d910e04d2037f18caf3cfc3a5/raw/44d39a5091c822aa1d653246413cf06b404adc01/Cold-start..tsv

Mirror: https://scribe.rip/@filia-aleks/aws-lambda-battle-2021-performance-comparison-for-all-languages-c1b441005fd1
paulmendoza over 3 years ago
I wonder if he used the special .NET compilation setting that speeds up cold starts. It requires compiling the function on Amazon Linux 2.
Havoc over 3 years ago
Where feasible I'm using Cloudflare Workers (and KV) instead, to be honest. Less versatile, but no tangible cold start time.
_wldu over 3 years ago
It's really amazing to see how Lambda has grown to support all of these languages.
xwdv over 3 years ago
If cold start times are an issue, use Cloudflare Workers instead; they're always warm.
eldavido over 3 years ago
C++. 200ms cold start. provided.al2 runtime environment.

Lambda success story:

Started with a .NET Core API about a year ago. Monolith-first, with a mix of clients across mobile and React. Async/await is one of the better things about C# (the language used for ASP.NET Core), and as a result we were able to do things you'd never consider doing in-process on a system like Ruby on Rails (right on the thread serving the HTTP request), like transcoding a 12-megapixel HEIC upload into JPEG. We just did it, left the connection open, and when it was done, returned an HTTP 200 OK.

That worked well for a while and let us serve tons of clients on a single Heroku dyno. The problem: memory. Resizing images takes tens or hundreds of MB when you're doing it into three different formats.

Over the last two weeks, I extracted the HEIC-to-JPEG transcode/resize out of our monolith into a Lambda. I'm extremely happy with how it turned out. We went with C++ because the whole idea was performance, we're going to be doing point cloud processing and other heavyweight stuff, and we wanted fine-grained control of memory. Our process has 28MB of dynamic libraries (.so files), starts in 200ms, and runs comfortably on a 512MB instance. We moved to 1024MB for a margin of safety in case we get a really large image. The system has progressed to "I don't even think about it"-level reliability. It just works, and I pay something like $1 for 40-50k transcode operations. No EC2 instances to manage, no queues, no task runners, no Ruby OOM, no running RabbitMQ, none of that (I'm a former ops engineer at a very high-scale analytics company).

As a general comment, I don't see many cloud services written in C/C++. This is no doubt partly because those skills just aren't widespread. But I think the bigger lesson is that it might be worth adding a little bit of development complexity to save 10x as much ops overhead. When I explained this setup to my friend, his first reaction was, "Why didn't you just put ImageMagick (the binary) into a container?" Once I explained that actually, I need to get images from S3, write them into several formats, manipulate their S3 keys in somewhat complex ways, fire off an HTTP request to a server, and pass a JWT around... sure, I could write this in a shell script, with wget, and curl, and everything else. But at some point you just have to write the right code for the job using the right tools.

I think hybrid approaches like this make the most sense. .NET and Java are great high-productivity tools for running server apps where memory is relatively abundant. I wouldn't try to move a system like that onto Lambda any more than I'd try to run something that more naturally fits a queue/worker pattern on a webserver. This seems kind of obvious, but if I'm being honest, it's probably experience talking a bit.

It's also neat to just get back to the metal a bit. Drop the containers, runtime environments, and multi-hundred-MB deployment packages; just ship a xx MB package up to the cloud, deploy it, and have it run as a standalone Linux binary with all the speed and simplicity that brings. Modern C++ is a totally different animal than 90s C++; I'd encourage giving it a try if you haven't in a while.
jmnicolas over 3 years ago
I just exploded in laughter when I read that Java OOMs at 128MB.

I can't explain it to my coworkers; they wouldn't understand.
Kalanos over 3 years ago
I read that they offer C as well. I wonder what those times look like.
bezossucks over 3 years ago
If you can run your entire function in V8 and don't need Node, Cloudflare Workers is MUCH faster, more affordable, more manageable, and more reliable. Workers get cloned all over the world, and you're not region-locked.

https://www.cloudflare.com/learning/serverless/serverless-performance/
Shadonototra over 3 years ago
That is why the language for cloud native is Go; languages like Java/C# are dead tech stacks that failed to reinvent themselves to stay relevant.