OMG, guys, this is getting really strange. Half of the commenters here read the word "process" and jumped to their own conclusions, possibly true in general, but obviously wrong in the case of Erlang.<p>It bears repeating: Erlang processes <i>are not</i> OS-level processes. The Erlang virtual machine, BEAM, runs in a single OS-level process. Erlang processes are closer to green threads, or the tasklets known from Stackless Python. They are extremely lightweight, implicitly scheduled, user-space tasks that share no memory. Erlang schedules its processes on a pool of OS-level threads for optimal utilization of CPU cores, but this is an implementation detail. What's important is that Erlang processes provide isolation in terms of memory and error handling, just like OS-level processes. Conceptually the two kinds of processes are very similar, but their implementations are nothing alike.
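To make the "lightweight" part concrete, here is a minimal Elixir sketch (Elixir runs on the same BEAM); the count is only illustrative:

  # Spawn 100,000 isolated, share-nothing processes; on BEAM this
  # finishes in well under a second on an ordinary laptop.
  pids =
    for i <- 1..100_000 do
      spawn(fn ->
        receive do
          {:ping, from} -> send(from, {:pong, i})
        end
      end)
    end

  # Killing one of them leaves the other 99,999 untouched:
  Process.exit(hd(pids), :kill)

Try doing the same with 100,000 OS-level processes and the distinction stops being academic.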
By the same way of speaking, you could say Facebook manages billions of PHP web servers, though no one talks about it that way. (PHP has a shared-nothing architecture; HHVM works similarly to the Erlang VM, if one can say so.)
I really love the idea of explaining the actor model as tons of tiny little servers compared to a single monolithic server. I tried to make the same comparison recently when I talked about adding distributed transactions to CurioDB (Redis clone built with Scala/Akka): <a href="http://blog.jupo.org/2016/01/28/distributed-transactions-in-actor-systems/" rel="nofollow">http://blog.jupo.org/2016/01/28/distributed-transactions-in-...</a>
Beautiful way of putting it. Also very close to Alan Kay's vision of "object oriented":<p>"In computer terms, Smalltalk is a recursion on the notion of computer itself. Instead of dividing “computer stuff” into things each less strong than the whole – like data structures, procedures, and functions which are the usual paraphernalia of programming languages – each Smalltalk object is a recursion on the entire possibilities of the computer. Thus its semantics are a bit like having thousands and thousands of computers all hooked together by a very fast network." -- The Early History of Smalltalk [1]<p>I also personally like the following: a web package tracker can be seen as a function that returns the status of a package when given the package id as an argument. It can also be seen as follows: every package has its own website.<p>I think the latter is vastly simpler, more powerful, and more scalable.<p>What's interesting is that both of these views can exist simultaneously, both on the implementation and on the interface side.<p>[1] <a href="http://gagne.homedns.org/~tgagne/contrib/EarlyHistoryST.html" rel="nofollow">http://gagne.homedns.org/~tgagne/contrib/EarlyHistoryST.html</a>
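In Erlang/Elixir terms the second view is almost literal; a hypothetical sketch (Package, loop/2 and the statuses are made up) where every package is its own tiny server:

  # One process per package; each one answers for itself.
  defmodule Package do
    def start(id), do: spawn(fn -> loop(id, :in_transit) end)

    defp loop(id, status) do
      receive do
        {:status, from} ->
          send(from, {:package, id, status})
          loop(id, status)
        {:update, new_status} ->
          loop(id, new_status)
      end
    end
  end

  # pid = Package.start("PKG-42")
  # send(pid, {:status, self()})

Both views still coexist: from the outside it is simply "give me the status of package X".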
I get the feeling that people reading this and saying "it's just kind of like a pool of PHP FastCGI instances or Apache worker pools" do not understand that Phoenix + Elixir can serve minimal dynamic requests only about 20% slower than nginx can serve static files. That is very, very fast.<p>It also leads to better code, thanks to the functional style and lots of great syntactic sugar like the |> operator, and OTP makes it easy to move processes (which have very little overhead) to different machines as you scale. Pattern matching and guards are also incredible.<p>I really do not want to write anything else!
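For anyone who hasn't used Elixir, a tiny contrived example of the pipe operator plus pattern matching with guards (nothing here is from Phoenix, it's just the language):

  defmodule Demo do
    # Guards choose the clause; no if/else chain needed.
    def describe(n) when is_integer(n) and n < 0, do: "negative"
    def describe(0), do: "zero"
    def describe(n) when is_integer(n), do: "positive"
  end

  # |> feeds each result into the next call, reading top to bottom:
  1..10
  |> Enum.map(&(&1 * &1))
  |> Enum.filter(&(rem(&1, 2) == 0))
  |> Enum.sum()
  # => 220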
This article got me to go figure out precisely what Erlang processes are. TL;DR: they aren't OS processes. So it is still conceivable that an error in the Erlang VM could bring down all your web servers at once.<p><a href="http://stackoverflow.com/questions/2708033/technically-why-are-processes-in-erlang-more-efficient-than-os-threads" rel="nofollow">http://stackoverflow.com/questions/2708033/technically-why-a...</a>
The title is somewhat misleading. I clicked expecting to read how someone is managing 2 million web server instances such as nginx or Apache, and I was curious what kind of company would make that claim.
In the late 90s I implemented the same concept for a web application written in Perl. (It's still running today.) There were three tiers to it:<p>Tier 1: a very small master program which ran in one process. Its job was to open the listening socket and maintain a pool of connection-handler processes.<p>Tier 2: connection-handler processes, forked from the master program. When they started they would load up the web application code, then wait for connections on the listening socket or for messages from the master process. They also monitored their own health and would terminate if they thought something had gone wrong (e.g. this protected them from memory leaks in the socket-handling code). When an HTTP connection came in on the socket, they would fork off a process to handle the request.<p>Tier 3: request handlers. These processes would handle one HTTP request and then terminate. When they started, they had a pristine copy of the web application code (thanks to copy-on-write memory sharing of forked processes), so I knew that no old data had leaked from previous requests. And since they were designed to terminate after a single request, error handling was no problem; those would terminate too. If a process consumed a lot of memory, it was released to the OS when the process ended. We also had a separate watchdog process that would kill any request handler that consumed too much CPU or memory, or was running much longer than our typical response time.<p>This scaled up to handling hundreds of concurrent requests per (circa 2005 Solaris) server, and around six million requests per day across a web farm of 4 servers. That was back in 2010; I don't know how much the traffic has grown since then, but I know the company is still running my web app. This was all <i>very</i> robust; before I left I had gotten the error rate down to a handful of crashed processes per year in code that was more than one release old.<p>BTW, while my custom HTTP server code could handle the entire app on its own, and was used that way in development, in production we normally ran it behind an Apache server that handled static files and reverse-proxied the page requests to the web app server. So those 6 million requests per day were for the dynamic pages, not all of the static files. That also meant that my web app didn't have to handle caching or keep-alive, which simplified the design and made the one-request-then-die approach more viable.
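That tier-3 shape (a fresh, disposable handler per request) is essentially what the article's Erlang processes give you without the fork cost; a hypothetical Elixir sketch of the same idea, with the request value and response logic as placeholders:

  # One throwaway process per request under a supervisor: no state leaks
  # between requests, its memory is reclaimed when it exits, and a crash
  # takes down only that single request.
  {:ok, _} = Task.Supervisor.start_link(name: ReqSup)

  handle_request = fn conn ->
    Task.Supervisor.start_child(ReqSup, fn ->
      # ... build and send the response for `conn` here ...
      IO.inspect(conn, label: "handled")
    end)
  end

  handle_request.("GET /status")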
I would like to see the code for the chat or presence server. I have a hunch it will look different depending on the experience of the programmer.<p>I'm especially interested in how they manage state, because when you do not have to manage state, everything becomes easy and scalable.
By "state" I mean, for example, a status message for a particular user.
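I don't know what their code looks like, but the usual OTP answer is to keep that state inside a process; a guessed-at GenServer sketch (Presence and the message names are made up, this is not their code):

  defmodule Presence do
    use GenServer

    # Per-user status messages kept as a plain map inside one process.
    def start_link(_opts), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)
    def set_status(user, msg), do: GenServer.cast(__MODULE__, {:set, user, msg})
    def status(user), do: GenServer.call(__MODULE__, {:get, user})

    def init(state), do: {:ok, state}
    def handle_cast({:set, user, msg}, state), do: {:noreply, Map.put(state, user, msg)}
    def handle_call({:get, user}, _from, state), do: {:reply, Map.get(state, user), state}
  end

Whether they use one such process, one per user, or something distributed is exactly the part I'd like to see.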
Could someone explain how, in the context of the article, a "process" differs from a "thread" in, say, Java or Python? Or are they one and the same?
A little while ago I wrote an extremely short introduction to distributed, highly scalable, fault-tolerant systems.<p>It is marketing material for my consulting activity, but some of you might find it interesting.<p>The PDF is here: <a href="https://github.com/siscia/intro-to-distributed-system/blob/master/intro_to_distributed.pdf" rel="nofollow">https://github.com/siscia/intro-to-distributed-system/blob/m...</a><p>The source code is open, so if you find a better way to describe things, feel free to open an issue or a pull request...
You don't need to "crash the server" in response to an error from a single user - it is sufficient to just close the connection and destroy the session.<p>I doubt that Erlang spawns millions of OS processes, because that would be extremely inefficient due to CPU context switching. So in reality, all Erlang is doing behind the scenes is closing the connection and destroying the session... It's not actually crashing and restarting any processes... You can easily implement this behavior with pretty much any modern server engine such as Node.js or Tornado.