Ask HN: How to run a Rails web crawler on AWS

3 points by wenqin123 about 10 years ago
Could someone be nice enough to point me in the right direction? I've got my web crawler working locally but now I want to put it onto an EC2 instance; however, I have no idea where to start.

Thanks!

1 comment

pjungwir about 10 years ago
Setting up Rails projects to run in production is my specialty. :-) Some people love it, some hate it, but it's a good chance to learn some Linux and sysadmin skills.

I'd advise you to do it sloppy the first time. Don't worry about configuration management, containers, and all that stuff, just learn how to set up the system. So things you'll have to do:

- Set up a `deployer` user account. (Actually I like to name these after the application they're responsible for.)
- Install rbenv and your Ruby of choice.
- Install nginx or Apache.
- Install a Rails app server. I like Unicorn, but if you go with Phusion Passenger there's not a separate process to manage; it just launches via your web server.
- Install your database and give it some initial contents.
- Add Capistrano to your Gemfile, write a cap script, get it working.
- Since you say this is a web crawler I assume background jobs are important. So if you're using Resque or Sidekiq you'll need to install Redis.

For bonus points:

- If you have cron jobs, use `whenever` so they get configured every time you run `cap`.
- Install SSL.
- Install fail2ban.
- Use unicorn instead of passenger.
- Use a process manager like god (which is ruby-based) to control unicorn and your background jobs. Btw if you do this, your cap tasks for restarting unicorn, delayed_job, etc. should change to just `sudo god restart myapp-unicorn`, not the direct commands.
- Use chef-solo (also ruby-based) so you don't have to do it all by hand next time. :-) Tip: to save time, run this against a vagrant VM until you get the bugs out.
- Start playing with more bits of the AWS ecosystem, e.g. add an ELB. Run everything in a VPC instead of "EC2 Classic". Use CloudFormation to launch instances and kick off Chef automatically (using chef-server instead of chef-solo), or try out OpsWorks, but IMO it's a bit beta. Use RDS.

Good luck, and have fun!
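
To make the god point a bit more concrete, here is a rough sketch of what such a restart task might look like. It assumes Capistrano 3 and a deploy user with passwordless sudo for `god`. The watch name `myapp-unicorn` comes from the command quoted above; `myapp-delayed_job`, the `GOD_WATCHES` constant, and the `god_restart_commands` helper are hypothetical names made up for illustration, not part of Capistrano or god.

```ruby
# Hypothetical file: lib/capistrano/tasks/god.rake

# Names of the god watches to restart on deploy.
# "myapp-delayed_job" is an assumed name; use whatever your god config defines.
GOD_WATCHES = %w[myapp-unicorn myapp-delayed_job].freeze

# Builds one "sudo god restart <watch>" command string per watch.
def god_restart_commands(watches = GOD_WATCHES)
  watches.map { |watch| "sudo god restart #{watch}" }
end

# Only define the task when this file is loaded by Capistrano,
# so the helper above can also be loaded and checked in plain Ruby.
if defined?(Capistrano)
  namespace :deploy do
    desc "Restart unicorn and background jobs through god"
    task :restart do
      on roles(:app) do
        # Delegate restarts to god instead of signalling the processes directly.
        god_restart_commands.each { |cmd| execute cmd }
      end
    end
  end

  # Run the restart once the new release has been switched in.
  after "deploy:publishing", "deploy:restart"
end
```

With a file like this under `lib/capistrano/tasks/` (a stock Capistrano 3 Capfile imports `*.rake` files from there), something like `cap production deploy:restart` should issue one `sudo god restart ...` per watch on the app servers. Treat it as a starting point, not a finished setup.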