We run migrations across 2,800 microservices

29 points by willsewell 9 months ago

6 comments

HenryBemis 9 months ago
Or, as I call it, "death by a thousand microcuts".

I always wonder why some (most) banks are proud of being reckless... oh well, it keeps me well paid.

Also, Monzo decided to remove the "dark mode" option back in the day. When I wrote to them about it ("please return it as optional, as it already was") they responded with a polite "nope, suck it up". My next message to them was to close my account. Well... "nope, suck it up" right back at you.
gsck 9 months ago
I like Monzo as a bank, and I think what they are doing is pretty cool.

But it all still feels very amateurish, especially for a bank. Take something as simple as generating a proof-of-payment receipt for a bank transfer: why is this not possible? It feels incredibly unprofessional to send a company a screenshot of a mobile app because your bank doesn't let you properly export a PDF for a single transaction.
jjice 9 months ago
What constitutes a microservice when you have 2,800? Are these individual lambdas for each endpoint and background task or something?
willsewell 9 months ago
There was a previous discussion related to our microservices architecture here: https://news.ycombinator.com/item?id=22725989.
lucianbr 9 months ago
> it would require a lot of effort to update all call sites, and in some cases the benefit of the new API was minimal. By wrapping the old library it meant we could choose to keep the interface similar to the old library in these cases, making it easier to update call sites.

Doesn't wrapping the old library also require a lot of effort to update all call sites?

If this is supposed to be general advice about libraries... does this mean wrap all libraries? That does not sound like a good idea to me.
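
One possible reading of the quoted passage, as a minimal Python sketch: the wrapper keeps the old call shape and forwards to the new library, so existing call sites need little or no change. Every name here (`NewTracer`, `start_span`, `LegacyTracerWrapper`, `trace`) is hypothetical and stands in for whatever the real libraries expose; this is not the code from the post.

```python
# Hypothetical names throughout: neither class corresponds to a real package.

class NewTracer:
    """Stand-in for a new library whose API takes an attributes dict."""

    def __init__(self):
        self.spans = []

    def start_span(self, name, attributes):
        # Record the span and return its index as a simple identifier.
        self.spans.append((name, dict(attributes)))
        return len(self.spans) - 1


class LegacyTracerWrapper:
    """Keeps the old call shape, trace(name, **tags), on top of the new API."""

    def __init__(self, new_tracer):
        self._new = new_tracer

    def trace(self, name, **tags):
        # Old call sites pass tags as keyword arguments;
        # the new API expects them as a single dict.
        return self._new.start_span(name, attributes=tags)


# An existing call site keeps calling trace() exactly as before.
tracer = LegacyTracerWrapper(NewTracer())
span_id = tracer.trace("charge_card", user="u1", amount=5)
```

Under these assumptions the per-call-site change can be as small as swapping an import, which may be cheaper than rewriting each call against the new API. It does not, by itself, answer whether every library deserves a wrapper.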
0xbadcafebee 9 months ago
I don't think the whole idea of (*reliably*) deploying and rolling back without downtime gets nearly enough meme-worthy attention on HN. It's quite complicated and depends entirely on a number of variables (specifically how you do *everything*). I once wrote an internal paper, probably 30 pages long, just to explain why we couldn't do automatic rollbacks.

The most important parts of such a system (the ones mentioned in this post, anyway) don't get nearly enough attention:

- "centrally driven migrations": In any distributed service architecture, there are always too many interdependent pieces. You can't reliably touch thing A without also touching things B, C, D, etc. If you want any chance of automation or of responding to failure without downtime, you must have a system which is aware of the changing state of everything and can change all the parts at a whim.

- "database migrations": This is again very complicated and depends on how your code and database are architected. You literally can't do migrations if your code and schema aren't set up right, and if you don't make the right kind of changes. How do you do this? Time to write a book...

- "wrap the old library": I can't remember what this is called, but it has a name. Anyway, the idea is that hiding any change behind what is effectively a feature-flag wrapper allows you to deploy the change without it being enabled, use the feature flag to test the change in production (on only one request, on a percentage of requests, on one whole node/pod, etc.), and then eventually delete the old code. This isn't just for features; you can replace entire interfaces, software stacks, or whole systems this way, either piecemeal or entirely. Very powerful, but again, it requires a specific approach not only in implementation but in use.

- "use automated rollback checks": What kind of checks? Checking what? In what way? At what time/stage? What happens when one fails? Do you do them in series or in parallel? *Can you* do them in series or in parallel? etc.

- "deploy least critical services first": With enough interdependent services, you're going to hit cases where you *have to* upgrade parts B and C effectively simultaneously before you can upgrade A, etc. So for "no downtime", it will take a lot of coordination, and very explicit linkage and checking of specific new services, etc. There are ways to do this, but they're specific to your implementation and services, so this is another example of how you have to know exactly what's going on, and then set up the deployment to account for your specific dependency tree and how the services react when they're run.

So many people I've run into don't think about any of these things. They literally say things like "automated rollbacks are easy, we did it at XYZ place", as if none of the above matters at all. They stick their heads in the sand because they *want to believe* that it should be easy. But any engineer worth their salt will tell you that doing it correctly and reliably is bloody complicated.
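
As a rough illustration of the "feature-flag wrapper" idea in the list above, here is a minimal Python sketch that routes a percentage of requests to a new implementation. The names `in_rollout` and `FlaggedClient`, the 100-bucket scheme, and hashing on a request ID are all assumptions made for the example, not anything described in the post.

```python
import hashlib


def in_rollout(request_id: str, percent: int) -> bool:
    """Deterministically map a request ID to one of 100 buckets."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


class FlaggedClient:
    """Hypothetical wrapper: same entry point, old or new code behind a flag."""

    def __init__(self, old_impl, new_impl, percent=0):
        self._old = old_impl
        self._new = new_impl
        self.percent = percent  # 0 = new code deployed but disabled

    def call(self, request_id, *args, **kwargs):
        # The same request ID always lands in the same bucket,
        # so a given request sees consistent behaviour at a fixed percent.
        if in_rollout(request_id, self.percent):
            return self._new(*args, **kwargs)
        return self._old(*args, **kwargs)


client = FlaggedClient(
    old_impl=lambda x: ("old", x),
    new_impl=lambda x: ("new", x),
    percent=0,
)
result_disabled = client.call("req-1", 7)  # old path: flag is at 0%

client.percent = 100
result_enabled = client.call("req-1", 7)   # new path: flag is at 100%
```

Setting `percent` back to 0 acts as the rollback in this sketch. As the comment stresses, this only helps if the old and new paths really are interchangeable for the data and state they touch, which the wrapper cannot guarantee.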