科技回声

15 comments

nrmitchi, over 5 years ago

If I'm reading this right, then this approach takes away any real safety in terms of deployment. There would be no easy rollback mechanism, and no real assurances that the new code version will actually run.

I understand that the main goal here seemed to be avoiding time spent in ECS rollouts, but this solution seems to be sacrificing many of the guarantees that the rollout process is designed to provide.

The root problem is explicitly called out (slow ECS deployments), and is tied to rate limiting of the ECS `start-task` API call. The post mentions the hard cap on the number of tasks per call, but I'm curious if the actual _rate limit_ could have been increased on the AWS side. I.e., 400 calls would still be needed, but they could be pushed through much faster.


benologist, over 5 years ago

Whenever I see these posts I feel like Heroku narrowly missed out on shaping the rest of the cloud just by staying proprietary and expensive.


marcinzm, over 5 years ago

My read seems to be: don't use ECS at large scale or you'll need some really convoluted hacks.


testuser5559191, over 5 years ago

Slightly off topic:

Does Plaid still operate via screen scraping? I'm a little perplexed as to why banks don't have easy-to-use APIs, especially given recent regulation. It seems against their best interests to allow a third party to screen scrape and provide a service which the banks themselves could easily reproduce.

What am I missing? Is a bank with an easy-to-use API not a sound business decision from the bank's perspective?

I know Monzo (a challenger bank in the UK) has/had an API, though I haven't heard of anyone using it.


sailfast, over 5 years ago

Thanks for sharing these lessons!

I don't use ECS at the moment, but this is a well-laid-out post on how to avoid some performance issues that could have a huge impact.

EDIT: Downvoted for expressing appreciation for someone taking the time to note lessons learned? OK.

fcolas, over 5 years ago

How did you guys scale that much without a bootloader before? That's what I don't get.

All the design patterns are those of Unix. You boot the kernel with a ... bootloader. Then you have the kernel with all the system's params (call it ECS). Then each process is a child of the root process. And when you get, by whatever means, the news that your app's source code has changed, you pull that code and start running it, while still having the old one live. Once the fork of the new code returns a proper response code, you kill the old one and set the new app live; otherwise you stay live with the old version.
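The swap described in this comment (start the new version next to the old one, promote it only if it checks out, otherwise keep the old one) can be sketched roughly as follows. This is a minimal illustration, not Plaid's bootloader: `replace_if_healthy` and `still_running` are hypothetical names, and "survives a short grace period" is an assumed stand-in for whatever "proper response code" means in a real service.

```python
import subprocess
import sys
from typing import Callable, Sequence


def replace_if_healthy(
    old: subprocess.Popen,
    new_cmd: Sequence[str],
    is_healthy: Callable[[subprocess.Popen], bool],
) -> subprocess.Popen:
    """Start new_cmd next to `old` and return the process that should stay live."""
    new = subprocess.Popen(new_cmd)
    if is_healthy(new):
        # The new version passed its check: retire the old one.
        old.terminate()
        old.wait()
        return new
    # The new version failed its check: discard it and keep the old one live.
    new.terminate()
    new.wait()
    return old


def still_running(proc: subprocess.Popen, grace_seconds: float = 2.0) -> bool:
    """Hypothetical health check: the process does not exit within the grace period."""
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        return True
    return False
```

A real check would more likely poll an HTTP health endpoint than watch for an early exit, but the control flow stays the same.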

swiftcoder, over 5 years ago

> Engineers would spend at least 30 minutes building, deploying, and monitoring their changes through multiple staging and production environments, which consumed a lot of valuable engineering time

Man, startups have no idea how good they have it. It took a solid week to deploy a change at AWS.

maerF0x0, over 5 years ago

> The rate at which we can start tasks restricts the parallelism of our deploy. Despite us setting the MaximumPercent parameter to 200%, the ECS start-task API call has a hard limit of 10 tasks per call, and it is rate-limited. We need to call it 400 times to place all our containers in production.

Reading the other comments makes me wonder whether you (Plaid) tried batching N containers into each task. If a task held 50 containers, you'd reduce the number of rate-limited task calls by 50x...
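The arithmetic behind this suggestion can be checked with a short sketch. It assumes the quoted figures imply about 4,000 single-container tasks (400 calls at 10 tasks per call), and it ignores any ECS limit on how many containers a single task definition may hold. `start_task_calls` is a hypothetical helper, not an AWS API.

```python
import math


def start_task_calls(containers: int, containers_per_task: int = 1, tasks_per_call: int = 10) -> int:
    """Number of start-task calls needed to place `containers` containers."""
    tasks = math.ceil(containers / containers_per_task)
    return math.ceil(tasks / tasks_per_call)


# Assumption: 400 calls at 10 tasks per call means roughly 4,000 containers.
containers = 400 * 10
print(start_task_calls(containers))                          # 400 calls, one container per task
print(start_task_calls(containers, containers_per_task=50))  # 8 calls, 50 containers per task
```

Under these assumptions, 50 containers per task brings 400 calls down to 8, which is the 50x reduction the comment describes.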


crb002, over 5 years ago

Google "checkpoint restart". HPC community has had these tools for years, many in userspace. Can't wait to see a Java or C# shop doing the same hot boots.


bsaul, over 5 years ago

Side question: what's the current best practice for ensuring that a server (Node or anything else) isn't currently processing some important information before you shut it down?

Is it a mix of waiting for request handlers to terminate upon receiving a SIGTERM, then ending the current process (and timing out after a while)? Does Kubernetes handle that kind of thing (waiting for a given process to stop before trashing the VM), or is there another layer or tool to do so?
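The pattern the question describes (stop taking new work on SIGTERM, wait for in-flight handlers, give up after a timeout) might look roughly like this. It is only a sketch: `Drainer` and `install_sigterm_handler` are hypothetical names, and a real server framework would usually provide its own hooks for this.

```python
import signal
import threading


class Drainer:
    """Tracks in-flight requests so shutdown can wait for them to finish."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._in_flight = 0
        self._stopping = False

    def begin(self) -> bool:
        """Call when a request arrives; returns False once shutdown has started."""
        with self._cond:
            if self._stopping:
                return False
            self._in_flight += 1
            return True

    def end(self) -> None:
        """Call when a request handler finishes."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def drain(self, timeout: float) -> bool:
        """Refuse new work, then wait up to `timeout` seconds for handlers to finish."""
        with self._cond:
            self._stopping = True
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout=timeout)


def install_sigterm_handler(stop_requested: threading.Event) -> None:
    """Register a SIGTERM handler that only sets a flag (call from the main thread)."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_requested.set())
```

The main loop would wait on `stop_requested`, call `drain(timeout)`, and then exit whether or not the drain succeeded. The return value tells it whether any handlers were cut off.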


cagataygurturk, over 5 years ago

Going to EKS would take less time than exploring hacks.


evantahler, over 5 years ago

Pretty cool! Actionhero uses the 'require cache' trick in development mode to hot-reload your changes as you go. It's risky in that even though you've changed the required file, you may not have recreated all your objects again. For that reason Actionhero doesn't allow this if NODE_ENV is anything besides development.

evantahler, over 5 years ago

Cool! I'm curious if this is something that nodemon/pm2 could do as task runners. You could call "npm update" and then hup your process...

This is sort of how Capistrano handled deployments, changing a symlink to all project deps and then signaling the process to reload.
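The symlink-then-signal approach mentioned here can be sketched as below. This is not Capistrano's code. The `current` link and release paths are hypothetical, the atomic swap relies on POSIX rename semantics, and it assumes the running process reloads on SIGHUP.

```python
import os
import signal
import tempfile


def switch_current(current_link: str, release_dir: str) -> None:
    """Atomically repoint `current_link` at `release_dir` (POSIX only)."""
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # Renaming over the old symlink replaces it in one step,
    # so readers never observe a missing or half-written link.
    os.replace(tmp_link, current_link)


def signal_reload(pid: int) -> None:
    """Ask the running process to reload (assumes it handles SIGHUP that way)."""
    os.kill(pid, signal.SIGHUP)
```

A deploy would unpack the new release into its own directory, call `switch_current`, and then call `signal_reload` with the server's PID. Rolling back is the same call pointed at the previous release directory.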

shay_ker, over 5 years ago

After all these years, how is deploying solely on AWS still worse than Heroku & Render?

mylampisawesome, over 5 years ago

Just FYI, your "We're Hiring!" link is broken.

How we reduced deployment times by 95%
