Lots of pets here, not many cattle. At this scale, tools like Kubernetes start to make more sense: you can think about your system in terms of "how much CPU/RAM/storage do I need in total?" Unless every one of those servers is running at or near capacity, a lot of cash is being wasted. There is also a maintenance cost. If one of these vital boxes goes down, what is the downtime implication and restoration cost?<p>I am not saying any of this to be critical of Lichess. There are different ways to solve these problems, and their way is clearly working. A setup like this also grows slowly over many years, so it is hard, if not impossible, to see the end state until you are there. The app is very quick and responsive. I got my ass handed to me on my first anon game :) My feedback is more for the community here, in the context of using this as a byte-sized case study.<p>At the end of the day we are reading and writing 1s and 0s to a network device or a disk. I have to imagine you can run and persist chess games with far fewer resources.