The Case for Building Scalable Stateful Services

139 points by aarkay, over 9 years ago

6 comments

packetslave, over 9 years ago
From a talk by Caitie McCaffrey of Twitter at the Strange Loop conference.

Video: https://www.youtube.com/watch?v=H0i_bXKwujQ

Slides: https://speakerdeck.com/caitiem20/building-scalable-stateful-services

themartorana, over 9 years ago
Stateful: you have one web server!

Stateless: you grow to need tens of servers or more. Horizontal scaling is much cheaper than vertical, so you look to software solutions to slow the growth in expenses and move to clustered NoSQL stores like Riak, Cassandra, Hadoop, etc. One or two engineers can still run the whole show, using cloud services, SaaS, and PaaS.

Stateful: you run thousands of servers, having since brought many services back in-house. Many if not most are your own metal, with dedicated staff. Looking to rein in power bills and space requirements, you turn once again to software solutions.

If you stay at the same growing company long enough, what's old will be new again.

jakozaur, over 9 years ago
Rule of thumb: if you design a service that runs on many servers, you have the following options:

1. Have a stateless service. You can update it frequently with no downtime. Relatively easy.
2. Use an off-the-shelf service that holds the state and that you don't need to update as often (e.g. ElastiCache, Cassandra). Relatively easy.
3. Write your own stateful service. For some applications it is a must (e.g. your own search service, data processing, a game collision engine). You need to take care of state transitions during restarts/upgrades, and client routing is also tricky. Hard, but sometimes there is no other way to build efficient infrastructure.
4. Don't think about state, and you may end up crying after your code hits prod.
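
To illustrate why client routing is tricky for option 3, here is a minimal sketch of one common approach, consistent hashing. The comment does not name a technique, so this is an assumption; `HashRing`, `vnodes`, and the server names are hypothetical. The idea is that a client key always lands on the same server, and removing a server only moves the keys that server owned.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit integer (md5 is used only to spread keys)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping client keys to stateful servers."""

    def __init__(self, servers, vnodes=64):
        self._vnodes = vnodes  # virtual points per server, to even out the load
        self._points = []      # sorted hash positions on the ring
        self._owner = {}       # hash position -> server name
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def route(self, key: str) -> str:
        """Return the server owning the first ring point after the key's hash."""
        if not self._points:
            raise LookupError("no servers in ring")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

With `ring = HashRing(["a", "b", "c"])`, calling `ring.route("user-17")` returns the same server every time, and after `ring.remove("b")` only the keys that were on `"b"` get a new home. The state held for those keys still has to be moved or rebuilt, which is the hard part the comment points at.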

rdtsc, over 9 years ago
A comparison with Erlang/OTP:

http://christophermeiklejohn.com/papers/2015/05/03/orleans.html

deathtrader666, over 9 years ago
Hasn't Erlang & OTP already solved this?

EGreg, over 9 years ago
I think that, in general, anything that has no persistence can be shared-nothing. State in a shared-nothing component is basically a cache, kept current by subscribing to changes in the data store, with only a slight lag.

Shared-nothing can include environments like user agents, proxies and web servers.

As for the persistence layer / data store, it should support horizontal partitioning. Especially useful is range-based partitioning on a primary key whose prefix contains a Geohash, because then you can route requests to the closest region on AWS or some other host.

If one of your shards gets too large, you can split it into two or more shards. All the monitoring and splitting can be automated with dev ops in the cloud to provision machines etc., so you don't need to wake up at 3am.

With this setup you can reliably grow your data store to an arbitrary size and have only O(log n) growth in latency for any request. However, there is one more issue to solve:

When you need to perform database queries that return a cross product or join, do you compute it on the fly for the request (e.g. with mapreduce), or do you precompute the result whenever a row is inserted into one of the joined tables? The second way can be done in the background and uses a memory-time tradeoff to make the queries O(1). This can be really useful for queries that need an answer in real time.

I would recommend using evented (e.g. Node.js) servers for queries that involve hitting multiple shards at the same time, or mapreduce-type things. Evented I/O lets you wait only as long as the longest query.

Finally, I don't think things like socket.io will be easy to partition horizontally, e.g. across a node cluster, so you'll probably want server affinity on a per-room basis.
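
The range-based partitioning and shard splitting described above can be sketched roughly as follows. This is an illustration under assumptions, not a description of any particular data store: `RangeShardMap`, `row_key`, the shard names, and the `geohash:row_id` key layout are all hypothetical. Lookups use binary search over the shard boundaries, which costs O(log n) in the number of shards.

```python
import bisect


class RangeShardMap:
    """Hypothetical range-partitioned key space; shard i owns [lower[i], lower[i+1])."""

    def __init__(self, first_shard: str):
        self._lower = [""]            # sorted lower bounds; "" sorts before every key
        self._shards = [first_shard]  # shard name for each lower bound

    def shard_for(self, key: str) -> str:
        # Binary search over the bounds: O(log n) in the number of shards.
        return self._shards[bisect.bisect_right(self._lower, key) - 1]

    def split(self, split_key: str, new_shard: str) -> None:
        """Hand keys >= split_key (up to the next bound) over to new_shard."""
        idx = bisect.bisect_right(self._lower, split_key)
        if self._lower[idx - 1] == split_key:
            raise ValueError("a shard already starts at this key")
        self._lower.insert(idx, split_key)
        self._shards.insert(idx, new_shard)


def row_key(geohash: str, row_id: str) -> str:
    # Geohash prefix first, so rows for nearby locations sort next to each other.
    return f"{geohash}:{row_id}"
```

Starting from `m = RangeShardMap("shard-1")`, calling `m.split("g", "shard-2")` sends every key from `"g"` upward to the new shard, so `m.shard_for(row_key("gcpv", "7"))` returns `"shard-2"` while keys with the prefix `"dr5r"` stay on `"shard-1"`. A real split would also have to move the rows themselves; this sketch only updates the routing table.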