This post reminds me of an experience I had in ~2005 while @ Hostway Chicago.

Unsolicited story time:

Prior to my joining the company, Hostway had transitioned from handling all email in a dispersed fashion across shared hosting Linux boxes with sendmail et al, to a centralized "cluster" of disparate, horizontally-scaled slices of edge-SMTP servers, delivery servers, POP3 servers, IMAP servers, and spam scanners. That seemed to be their scaling plan, anyway.

In the middle of this cluster sat a refrigerator-sized EMC fileserver for storing the Maildirs. I forget the exact model, but it was quite expensive and exotic for the time, especially for an otherwise run-of-the-mill, commodity-PC-based hosting company. It was a big shiny expensive black box, and everyone involved seemed to assume it would Just Work, and that they could keep adding more edge-SMTP/POP/IMAP or delivery servers whenever those respective services became resource constrained.

At some point a pile of additional customers were migrated into this cluster, through an acquisition if memory serves, and things started getting slow and unstable. So they added more machines to the cluster, and the situation just got worse.

Eventually it got to where every Monday was known as Monday Morning Mail Madness, because all weekend nobody would read their mail. Then come Monday, there's this big accumulation of new unread messages that all needs to be downloaded and either archived or deleted at once.

The more servers they added, the more NFS clients they added, and this just increased the ops/sec hitting the EMC. Instead of improving things, they were basically DDoSing their overpriced NFS server by trying to shove more iops down its throat at once.

Furthermore, by running delivery and POP3+IMAP services on separate machines, they were preventing any sharing of buffer caches between services that are embarrassingly cache-friendly when colocated. When the delivery servers wrote emails through to the EMC, those emails also lingered in local RAM (and these machines had several gigabytes of it), only to *never* be read again. Then when customers checked their mail, the POP3/IMAP servers *always* had to hit the EMC to access new messages: data that was *probably* sitting uselessly in a delivery server's RAM somewhere.

None of this was under my team's purview at the time, but when the castle is burning down every Monday, it becomes an all-hands-on-deck situation.

When I ran rough numbers on how much real data was actually being delivered and retrieved, it was a trivial amount for a moderately beefy PC of the era to handle.

So the obvious thing to do was to simply colocate the primary services accessing the EMC, so they could actually profit from the shared buffer cache, and shut off most of the cluster. At the time that meant POP3 and delivery (smtpd); luckily IMAP hadn't taken off yet.

The main barrier to doing this all with one machine was the amount of RAM required, because all the services were built on classical UNIX-style multi-process implementations (courier-pop and courier-smtp, IIRC). So in essence, the main reason most of this cluster existed was just to have enough RAM to run multi-process POP and SMTP sessions.

What followed was a kamikaze-style, developed-in-production conversion of courier-pop and courier-smtp to use pthreads instead of processes, by yours truly.
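For the curious, the general shape of that change was the classic fork-per-connection to thread-per-connection swap. A minimal from-memory sketch of the idea (this is NOT the actual courier patch; handle_session() and port 1110 are placeholders):

    /* Illustrative only: thread-per-connection accept loop
     * replacing fork()-per-connection. */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    static void *handle_session(void *arg)
    {
        int fd = (int)(intptr_t)arg;
        /* A real server speaks POP3 here; we just greet and hang up. */
        const char *hi = "+OK POP3 ready\r\n";
        (void)write(fd, hi, strlen(hi));
        close(fd);
        return NULL;
    }

    int main(void)
    {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        if (lfd < 0) { perror("socket"); return 1; }

        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        struct sockaddr_in a = {0};
        a.sin_family = AF_INET;
        a.sin_addr.s_addr = htonl(INADDR_ANY);
        a.sin_port = htons(1110);   /* placeholder port */
        if (bind(lfd, (struct sockaddr *)&a, sizeof a) < 0) {
            perror("bind");
            return 1;
        }
        listen(lfd, 128);

        for (;;) {
            int cfd = accept(lfd, NULL, NULL);
            if (cfd < 0)
                continue;
            /* Each session is a detached thread in one shared address
             * space: N sessions cost N stacks, not N process images. */
            pthread_t t;
            if (pthread_create(&t, NULL, handle_session,
                               (void *)(intptr_t)cfd) != 0) {
                close(cfd);
                continue;
            }
            pthread_detach(t);
        }
    }

The win is that N concurrent sessions now cost N small thread stacks in one shared address space, instead of N mostly-duplicated process images, which is what let one big-RAM box stand in for the fleet.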
After a week or so of sleepless nights, we had all the cluster's POP3 and delivery running on a single box with a hot spare. Within a month or so, IIRC, we had powered down most of the cluster, leaving just the spam scanning and edge-SMTP stuff for horizontal scaling, since those didn't touch the EMC. Eventually even the EMC was powered down, in favor of drbd+nfs on more commodity Linux boxes w/Coraid.

According to my old notes, what we ended up with for the POP3+delivery server was a Dell 2850 w/8GB RAM, plus an identical hot spare, replacing *racks* of comparable machines that just had less RAM. >300,000 email accounts.
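To put purely illustrative numbers on why that worked (these are assumptions, not figures from my notes): even if each of those 300,000 accounts received 40 messages a day averaging 50KB, that's ~600GB/day of delivered mail, or roughly 7MB/s sustained. Double it for retrieval and you're still an order of magnitude below what a single gigabit NIC could move in 2005. The bottleneck was never raw throughput; it was RAM for all those per-session processes, plus the wasted buffer cache.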