TechEcho

Using Amazon Auto Scaling with Stateful Applications

58 points by amagnus almost 9 years ago

4 comments

jcrites almost 9 years ago
Glad to see writeups and lessons learned about AutoScaling. As a system builder, I think it's a capability that's under-appreciated. My rule of thumb is that essentially every machine fleet benefits from being managed as an AutoScaling Group (whether stateless or not - there's value even without dynamic scaling).

> If you’re working with a queue based model then scaling will be done based on the SQS queue size, otherwise we’ll use the custom metric “number of running jobs”.

There is another strategy to consider when auto-scaling queue-based applications: if you have a fixed number of threads per machine instance that process work from a queue, then you can scale based on the percentage of threads occupied.

You can treat this as a utilization percentage similar to CPU Utilization, the observation being that you can begin scaling up before a (long) queue ever develops. For example, consider an application where the average queue size is zero, and the average thread utilization is zero. Consider another application where the average queue size is zero, because messages are processed almost immediately, but 19 out of 20 threads are occupied on average. You can conclude that the first application is nearly idle, while the second is nearly maxed out, even though both of them have empty queues. By considering the second application to be (19/20 =) 95% utilized, you can establish a scaling policy that scales up before a backlog develops -- assuming that you wish to avoid developing a long queue, which is desirable in some cases. It depends on how quickly you need to process messages, i.e. your SLA.

(The article touches on this as well, talking about number of running jobs.)

Queue size can be useful as well, but I think it can be more difficult to tune. Percent of thread capacity works well regardless of how large your fleet is, and how expensive messages are to process. By comparison, a large-scale system that processes thousands of messages per second could develop a large queue -- in terms of number of items -- from a brief blip, which it will burn through momentarily. A 20,000 item queue might be nothing to such a system, whereas for another system, 100 items could be significant, if each one is a 10GB video to download and transcode.

The ideal auto-scaling solution typically involves a mix of multiple measurement techniques, since it's rare that any single performance characteristic captures an application's load perfectly. I would definitely agree with the author's point that instrumentation is highly valuable.
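
A minimal sketch of the thread-utilization idea, in Python. The function names, the 75%/25% thresholds, the one-instance step, and the fleet bounds are all hypothetical choices for illustration; nothing here calls a real AWS API, and a real setup would publish the utilization as a custom metric and let a scaling policy act on it.

```python
def thread_utilization(busy_threads: int, total_threads: int) -> float:
    """Fraction of worker threads currently occupied, from 0.0 to 1.0."""
    if total_threads <= 0:
        raise ValueError("total_threads must be positive")
    return busy_threads / total_threads


def desired_instances(current: int, utilization: float,
                      scale_up_at: float = 0.75, scale_down_at: float = 0.25,
                      min_size: int = 1, max_size: int = 20) -> int:
    """Step the fleet size by one instance based on thread utilization.

    Thresholds and bounds are hypothetical defaults, not recommendations.
    """
    if utilization >= scale_up_at:
        target = current + 1
    elif utilization <= scale_down_at:
        target = current - 1
    else:
        target = current
    # Clamp to the allowed fleet size.
    return max(min_size, min(max_size, target))


# Both applications have an empty queue, but very different headroom.
idle = thread_utilization(busy_threads=0, total_threads=20)
busy = thread_utilization(busy_threads=19, total_threads=20)
print(idle, busy)                                             # 0.0 0.95
print(desired_instances(4, idle), desired_instances(4, busy))  # 3 5
```

Queue length never appears in the decision, which is why the same thresholds can carry over between a fleet handling thousands of small messages per second and one transcoding a handful of large videos.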
stretchwithme almost 9 years ago
You'll want to make sure you don't autoscale just because a shared resource like a database is not working correctly. There's no point in having more instances up if the new ones don't work either.

Elastic Beanstalk can be very useful, especially for deploying new versions of your app. It has a CLI that makes it easy to SSH to all of your instances as well. Application environments can be configured to save their logs to an S3 bucket. It also supports time-based scaling events, so you can have more hosts up during the day if that's what you need.
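
One way to picture the first point, as a rough Python sketch: gate the scale-up decision on a health check of the shared dependency. The `should_scale_up` name, the 0.75 threshold, and the `dependency_healthy` callback are hypothetical; how you actually probe the database is up to you.

```python
from typing import Callable


def should_scale_up(utilization: float,
                    dependency_healthy: Callable[[], bool],
                    threshold: float = 0.75) -> bool:
    """Scale up only when load is high AND the shared dependency works.

    `dependency_healthy` is a hypothetical probe, e.g. a cheap query
    against the database that every instance relies on.
    """
    if utilization < threshold:
        return False
    # High load with a broken database usually means stuck workers,
    # so adding instances would not help.
    return dependency_healthy()


print(should_scale_up(0.95, lambda: True))   # True
print(should_scale_up(0.95, lambda: False))  # False
```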
jjeaff almost 9 years ago
I would be curious what variance of load would be required for this type of auto scaling to be worth the time vs just putting your own bare metal in a colocation facility or leasing dedicated boxes.

The price performance is so much higher on dedicated. Or for that matter, using reserved instances instead of scaling with spot prices.
stephengillie almost 9 years ago
Stateful applications should use vertical autoscale to hot-add CPUs. While this has a lower limit than horizontal autoscale, the two can be combined with sticky persistence to maintain state at scale.