Could someone be nice enough to point me in the right direction? I've got my web crawler working locally, but now I want to put it onto an EC2 instance and I have no idea where to start.<p>Thanks!
Setting up Rails projects to run in production is my specialty. :-) Some people love it, some hate it, but it's a good chance to learn some Linux and sysadmin skills.<p>I'd advise you to do it the sloppy way the first time. Don't worry about configuration management, containers, and all that stuff; just learn how to set up the system. Things you'll have to do:<p>- Set up a `deployer` user account. (Actually I like to name these after the application they're responsible for.)<p>- Install rbenv and your Ruby of choice.<p>- Install nginx or Apache.<p>- Install a Rails app server. I like Unicorn, but if you go with Phusion Passenger there's no separate process to manage: it just launches via your web server.
<p>- Install your database and give it some initial contents.<p>- Add Capistrano to your Gemfile, write a cap script, and get it working.<p>- Since you say this is a web crawler I assume background jobs are important. So if you're using Resque or Sidekiq you'll need to install Redis.<p>For bonus points:<p>- If you have cron jobs, use `whenever` so they get configured every time you run `cap`.<p>- Set up SSL (get a certificate and configure your web server for HTTPS).<p>- Install fail2ban.<p>- Use Unicorn instead of Passenger.<p>- Use a process manager like god (which is Ruby-based) to control Unicorn and your background jobs. Btw if you do this, your cap tasks for restarting Unicorn, delayed_job, etc. should change to just `sudo god restart myapp-unicorn`, not the direct commands.<p>- Use chef-solo (also Ruby-based) so you don't have to do it all by hand next time. :-) Tip: to save time, run this against a Vagrant VM until you get the bugs out.<p>- Start playing with more bits of the AWS ecosystem, e.g. add an ELB. Run everything in a VPC instead of "EC2 Classic". Use CloudFormation to launch instances and kick off Chef automatically (using chef-server instead of chef-solo), or try out OpsWorks, but IMO it's a bit beta. Use RDS.<p>Good luck, and have fun!
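P.S. In case the god step is too abstract, here is a rough sketch of what a watch for Unicorn might look like. It is only an illustration under assumptions: the app living in `/home/myapp/current`, the `config/unicorn.rb` location, and the pid file path are all made-up placeholders, and you would adjust them to match your own layout. The only name taken from above is `myapp-unicorn`.

```ruby
# myapp-unicorn.god -- load with `god -c /path/to/myapp-unicorn.god`
# All paths below are hypothetical placeholders; adjust to your own layout.
app_root = "/home/myapp/current" # assumed Capistrano "current" symlink

God.watch do |w|
  # The name you pass to `god restart`, `god stop`, etc.
  w.name = "myapp-unicorn"

  # Run commands from the app directory with the production environment set.
  w.dir = app_root
  w.env = { "RAILS_ENV" => "production" }

  # Start Unicorn daemonized (-D); god then tracks it through the pid file.
  w.start = "bundle exec unicorn -c #{app_root}/config/unicorn.rb -E production -D"

  # Must match the `pid` setting in your config/unicorn.rb (assumed path).
  w.pid_file = "#{app_root}/tmp/pids/unicorn.pid"

  # Remove a stale pid file before starting.
  w.behavior(:clean_pid_file)

  # Start the process if it is not running and bring it back if it dies.
  w.keepalive
end
```

With something like this loaded, `sudo god restart myapp-unicorn` is the single command your cap restart task needs to run. A similar watch (with a different `w.name` and `w.start`) would cover your background job workers.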