Fantastic list with much more depth than I expected. Some surprises that others might be interested in from this article and comments below:<p><pre><code> [1] Keeping buckets locked down and allowing direct client -> S3 uploads
 [2] Using ALIAS records for easier redirection to core AWS resources instead of CNAMEs.
[3] What's an ALIAS?
[-] Using IAM Roles
[4] Benefits of using a VPC
[-] Use '-' instead of '.' in S3 bucket names that will be accessed via HTTPS.
[-] Automatic security auditing (damn, entire section was eye-opening)
[-] Disable SSH in security groups to force you to get automation right.
</code></pre>
[1] <a href="http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html" rel="nofollow">http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlU...</a><p>[2] <a href="http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/CreatingAliasRRSets.html" rel="nofollow">http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Cre...</a><p>[3] <a href="http://blog.dnsimple.com/2011/11/introducing-alias-record/" rel="nofollow">http://blog.dnsimple.com/2011/11/introducing-alias-record/</a><p>[4] <a href="http://www.youtube.com/watch?v=Zd5hsL-JNY4" rel="nofollow">http://www.youtube.com/watch?v=Zd5hsL-JNY4</a>
I'd also add to the list - make sure that AWS is right for your workload.<p>If you don't have an elastic workload and are keeping all of your servers online 24/7, then you should investigate dedicated hardware from another provider. AWS really only makes sense ($$) when you can take advantage of the ability to spin up and spin down your instances as needed.
One thing the article mentions is terminating SSL on your ELB. If you want more control over your SSL setup AND want to get remote IP information (e.g. X-Forwarded-For) ELB now supports PROXY protocol. I wrote a little introduction on how to set it up[0]. They haven't promoted it very much, but it is quite useful.<p>[0]: <a href="http://jud.me/post/65621015920/hardened-ssl-ciphers-using-aws-elb-and-haproxy" rel="nofollow">http://jud.me/post/65621015920/hardened-ssl-ciphers-using-aw...</a>
Be very careful with assigning IAM roles to EC2 instances. Many web applications have some kind of implicit proxying, e.g. a function to download an image from a user-defined URL. You might have remembered to block 127.0.0.*, but did you remember 169.254.169.254? Are you aware why 169.254.169.254 is relevant to IAM roles? Did you consider hostnames pointed at 169.254.169.254? Did you consider that your HTTP client might do a separate DNS look-up? etc.<p>There are other subtleties which make roles hard to work with. The same policies can have different effects for roles and users (e.g., permission to copy from other buckets).<p>IAM Roles can be useful, especially for bootstrapping (e.g. retrieving an encrypted key store at start-up), but only use them if you know what you're doing.<p>Conversely, tips like disabling SSH have negligible security benefit if you're using the default EC2 setup (private key-based login). It's really quite useful to see what's going on in an individual server when you're developing a service.<p>Also, it does matter whether you put a CDN in front of S3. Even when requesting a file from EC2, CloudFront is typically an order of magnitude faster than S3. And even when using the website endpoint, S3 is not designed for web sites: it will serve 500s relatively frequently and does not scale instantly.
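To make the metadata-service pitfall concrete: 169.254.169.254 is the EC2 instance metadata endpoint, and with an IAM role attached, anything that can make the instance fetch a URL can read the role's temporary credentials from it. A naive URL filter, resolving the user-supplied hostname yourself, might look like the sketch below (my own illustration, not from the article; note the comment's caveat that your HTTP client may do its own DNS look-up afterwards, so this check alone is not sufficient):

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_blocked(url):
    """Reject URLs whose host resolves to a loopback, link-local, or
    private address. 169.254.0.0/16 (link-local) covers the EC2
    metadata endpoint 169.254.169.254.
    CAVEAT: if the HTTP client re-resolves DNS later, an attacker can
    swap the record between this check and the actual request (TOCTOU).
    """
    host = urlparse(url).hostname
    if host is None:
        return True  # unparseable URL: refuse
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True  # unresolvable: refuse
    for _family, _type, _proto, _canon, sockaddr in infos:
        ip = ipaddress.ip_address(sockaddr[0])
        if ip.is_loopback or ip.is_link_local or ip.is_private:
            return True
    return False
```

The robust fix is to resolve once, connect to the vetted IP directly, and pin the Host header, rather than trusting a pre-flight check like this one.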
> you pay the much cheaper CloudFront outbound bandwidth costs, instead of the S3 outbound bandwidth costs.<p>What? CloudFront bandwidth costs are, at best, the same as S3 outbound costs, and at worst much more expensive.<p>S3 outbound costs are 12 cents per GB worldwide. [1]<p>CloudFront outbound costs are 12-25 cents per GB, depending on the region. [2]<p>Not only that, but your cost-per-request on CloudFront is way more than S3 ($0.004 per 10,000 requests on S3 vs $0.0075-$0.0160 per 10,000 requests on CloudFront)<p>[1] <a href="http://aws.amazon.com/s3/pricing/" rel="nofollow">http://aws.amazon.com/s3/pricing/</a>
[2] <a href="http://aws.amazon.com/cloudfront/pricing/" rel="nofollow">http://aws.amazon.com/cloudfront/pricing/</a>
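To put rough numbers on it (a quick sanity check using only the per-GB and per-request prices quoted above; real AWS pricing is tiered and changes, so treat these as illustrative):

```python
# Prices quoted in the comment above, in USD. Illustrative only.
S3_OUT_PER_GB = 0.12
CF_OUT_PER_GB_MAX = 0.25      # most expensive CloudFront region
S3_PER_10K_REQUESTS = 0.004
CF_PER_10K_REQUESTS_MAX = 0.016

def monthly_cost(gb_out, requests, per_gb, per_10k_requests):
    """Outbound bandwidth cost plus per-request cost for one month."""
    return gb_out * per_gb + (requests / 10_000) * per_10k_requests

# Example month: 1 TB served, 10 million requests.
s3_cost = monthly_cost(1000, 10_000_000, S3_OUT_PER_GB, S3_PER_10K_REQUESTS)
cf_worst = monthly_cost(1000, 10_000_000, CF_OUT_PER_GB_MAX,
                        CF_PER_10K_REQUESTS_MAX)
```

At these list prices the worst-case CloudFront bill is more than double the S3 bill for the same traffic, which is the commenter's point: CloudFront buys latency, not savings.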
Lots of very useful tips there!<p>There's one that I think could be improved on a little:<p><pre><code> Uploads should go direct to S3 (don't store on local filesystem and have another process move to S3 for example).
</code></pre>
You could even use a temporary URL[0,1] and have the user upload directly to S3!<p>[0]: <a href="http://stackoverflow.com/questions/10044151/how-to-generate-a-temporary-url-to-upload-file-to-amazon-s3-with-boto-library" rel="nofollow">http://stackoverflow.com/questions/10044151/how-to-generate-...</a>
[1]: <a href="http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html" rel="nofollow">http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlU...</a>
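The linked docs use boto, but the legacy signature-v2 query-string signing that boto's generate_url performs is small enough to sketch with just the stdlib (bucket and key names below are made up, and new code should prefer SigV4 via boto3; this only shows what the "temporary URL" actually is):

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote

def presigned_put_url(bucket, key, access_key, secret_key, expires_in=3600):
    """Build a legacy (signature v2) presigned PUT URL for S3.
    Anyone holding this URL can upload to exactly this key until
    `expires_in` seconds from now -- no AWS credentials client-side.
    """
    expires = int(time.time()) + expires_in
    # SigV2 string-to-sign: verb, content-md5, content-type, expires,
    # canonical resource (md5/type left empty here).
    string_to_sign = "PUT\n\n\n{}\n/{}/{}".format(expires, bucket, key)
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest))
    return ("https://{}.s3.amazonaws.com/{}"
            "?AWSAccessKeyId={}&Expires={}&Signature={}").format(
                bucket, quote(key), access_key, expires, signature)
```

Hand the returned URL to the browser and have it PUT the file body directly; your servers never touch the bytes.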
Good article, but I think it says too little about persistence. The trade-off of EBS vs ephemeral storage, for example, is not mentioned at all.<p>Getting your application server up and running is the easiest part of operations, whether you do it by hand via SSH, or automate and autoscale everything with ansible/chef/puppet/salt/whatever. Persistence is the hard part.
Really useful article, though I don't agree with serving assets straight from S3 instead of through a CDN. There are multiple articles showing that S3's performance is quite poor compared to CloudFront, and that it isn't well suited to serving assets directly.
Along these lines, I recommend installing New Relic server monitoring on all your EC2 instances.<p>The server-level monitoring is free, and it's super simple to install. (The code we use to roll it out via ansible: <a href="https://gist.github.com/drob/8790246" rel="nofollow">https://gist.github.com/drob/8790246</a>)<p>You get 24 hours of historical data and a nice web UI. Totally worth the effort.
<p><pre><code> > Use random strings at the start of your keys.
> This seems like a strange idea, but one of the implementation details
> of S3 is that Amazon use the object key to determine where a file is physically
> placed in S3. So files with the same prefix might end up on the same hard disk
> for example. By randomising your key prefixes, you end up with a better distribution
> of your object files. (Source: S3 Performance Tips & Tricks)
</code></pre>
This is great advice, but one small conceptual correction: the prefix doesn't control where the file contents are stored; it controls where the index entry for that file is stored.
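The randomisation itself is cheap to do. One sketch (my own example; the 4-hex-character prefix scheme is just one option) derives the prefix from the key name, so it spreads lexicographically-close keys across index partitions while staying deterministic:

```python
import hashlib

def distributed_key(original_key):
    """Prefix an S3 key with a few hex chars derived from the key name.
    Sequential names like logs/2014-02-01.gz, logs/2014-02-02.gz get
    unrelated prefixes and land on different index partitions, but the
    mapping is deterministic, so you can recompute it for reads.
    """
    prefix = hashlib.md5(original_key.encode()).hexdigest()[:4]
    return "{}/{}".format(prefix, original_key)
```

The trade-off is that you lose meaningful prefix listing (you can no longer list all of `logs/` in one call), so keep a separate index if you need enumeration.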
One painful-to-learn issue with AWS is service limits, some of which are not at all obvious. Everything has a hard limit, and unless you have a support plan it can take days or weeks to get them lifted: each limit is handled by the respective department and lifted (or rejected) one by one. More than once we've hit a Security Group limit right before a production push, or something similar.<p>RDS and CloudFront are also extremely painful to launch. I've had incidents where RDS took nearly 2 hours to launch a blank multi-AZ instance, and CloudFront distributions take 30 minutes to complete. My CloudFormation templates easily run over an hour just blocking on those two.<p>VPC is nice, I love it, but it takes time to grasp the difference between Network ACLs and Security Groups, and especially why the heck you need to run your own NATs. Why isn't that part of the service?! The "high" availability NAT scripts they provide are outdated, buggy in fact, and support only 2 AZs.<p>Last, but not least, a CloudFront "flush" takes over 20 minutes, even for empty distributions. And you can't do a hot switch from one distribution to another: changing a CNAME also takes 30 minutes, and two distributions cannot have the same CNAME (it's a weird edge case scenario, but anyway).
> Have tools to view application logs.<p>Yes! Centralized logging is an absolute must: don't depend on being able to log in and look at logs by hand. That grows wearisome fast.
i'm a devops noob. what tools should i use to log / monitor all my servers?<p>i don't want to learn some complex stuff like chef/puppet btw.... anything SIMPLE?
Can you (or somebody else) elaborate on disabling ssh access? Is this a dogma of "automation should do everything" or is there a specific security concern you are worried about? What is the downside of letting your ops people ssh into boxes, or for that matter of their needing to do so?
How hard is it to roll your own version of AWS's security groups? I want to set up a Storm cluster, but the methods I have come up with for firewalling it while preserving elasticity all seem a bit fragile.
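One low-tech way to approximate security groups on plain hosts is to regenerate a host-firewall whitelist from the current member list (fetched however you track membership, e.g. via the EC2 API) and re-apply it on every scaling event. A sketch of the rule generation (my own illustration; chain handling is simplified and the ports are Storm's defaults by assumption):

```python
def iptables_rules(member_ips, ports):
    """Generate iptables commands that accept traffic on `ports` only
    from current cluster members and drop everyone else.
    In practice you'd flush/rebuild a dedicated chain atomically
    (e.g. via iptables-restore) instead of appending to INPUT.
    """
    rules = []
    for ip in member_ips:
        for port in ports:
            rules.append(
                "iptables -A INPUT -p tcp -s {} --dport {} -j ACCEPT"
                .format(ip, port))
    # Default-deny for the protected ports, appended after the accepts.
    for port in ports:
        rules.append("iptables -A INPUT -p tcp --dport {} -j DROP"
                     .format(port))
    return rules
```

The fragile part isn't the rules, it's convergence: every node must re-run this when membership changes, which is exactly the coordination problem security groups solve for you.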
As an Australian developer, using an EC2 instance seems to be the cheapest option if you want a server based in this country. Anyone got any other recommendations?
Can anyone explain how disabling ssh has anything to do with automation? We automate all our deployments through ssh, and I wasn't aware of another way of doing it.