We built network isolation for 1500 services

135 points by p10jkle over 5 years ago

16 comments

d4nt over 5 years ago
Locking down this network of services is a massive security improvement and they've used some very neat ways of achieving it. Overall, I really appreciate them writing this up.

However, 1500 services? That really feels like they're separating things at too granular a level. Does every one of those things really need to sit behind a network call? Couldn't some of that re-use be via code libraries? I wonder what the service to developer ratio is?
all_usernames over 5 years ago
Great post. I really appreciate engineering blogs written in this storytime format. I don't have time to dive into the implementation of Calico or <insert one of the 1,261 kubernetes projects here>*, but I learn a lot from reading the process a team goes through in figuring out and iterating on a solution.

* https://landscape.cncf.io/
sansnomme over 5 years ago
Another potential solution is to use a constraint solver like MSFT Z3, or if you want a nicer syntax and more flexibility, Prolog.

E.g. https://medium.com/@ahelwer/checking-firewall-equivalence-with-z3-c2efe5051c8f

This is much more scalable in the long run.
gravypod over 5 years ago
If the authors are reading this I was wondering two things:

1. Why was static analysis of the code chosen over observing the system during runtime and integration testing?

2. What was the reason the CNI layer was chosen for the implementation of this over the service mesh layer?

Something that really interests me about bazel/buck/pants/please is that it automates #1 entirely with dep queries.
z3t4 over 5 years ago
Network filtering is a nice extra layer, but it should not be the only layer. Services should require authorization as if they were an open API.
rawoke083600 over 5 years ago
"But we already have over 1,500" wow... I would start there...
purple_ducks over 5 years ago
> attempt to find code that looked like it was making a request to another service.

> We generally fixed those cases by adding a special comment in the code that told rpcmap about the link

Why not enforce that all endpoints/URLs be defined in a config file and sidestep this? Scanning code for URLs or constructed URLs is overkill and brittle.
grandinj over 5 years ago
Strikes me that some services ideally need to expose multiple interfaces, and that isolation should be on a per-service-interface basis.

E.g. the monitoring service should only be able to access the metrics part of each service.
aSplash0fDerp over 5 years ago
Nice write-up! That's the beauty of scale: explain a part in detail, then go with the 30,000 foot view.

IMM, the security orchestration may actually become the "app" as speeds continue to increase, compute costs go even lower and losses incurred from compromised data/networks increase.

A true zero trust platform that keeps all of the doors closed or "instances/vm" offline until (the milliseconds) they're needed is the security symphony we might see on the horizon.

Data silos and walled gardens may never go out of style, they'll just take on new acronyms.
angry_octet over 5 years ago
Impressive achievement. It still sounds like callees have more knowledge of callers than is justified. Is it a security property or a component functionality property? How do those interact?

A centralised graph representation of the security/functionality properties would be a better way to represent this information, so it can catch the addition of interfaces which should be forbidden. It could also be configuration managed as sets of microservices.

If you have a connectivity graph it would be good to do taint analysis to see how far bad information can propagate.
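
As a rough illustration of the taint-analysis idea in the comment above, here is a minimal sketch that treats the allowed-call rules as a directed graph and walks it breadth-first to list every service that data from one starting service could reach. The service names and the `ALLOWED_CALLS` mapping are hypothetical and are not taken from the article.

```python
from collections import deque

# Hypothetical allowed-call graph: caller -> set of callees it may reach.
# These names are illustrative only, not Monzo's real services.
ALLOWED_CALLS = {
    "service.web": {"service.account", "service.feed"},
    "service.account": {"service.ledger"},
    "service.feed": set(),
    "service.ledger": set(),
    "service.metrics": {"service.ledger"},
}


def tainted_services(graph, source):
    """Return every service reachable from `source` via allowed calls.

    The source itself is excluded from the result.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        caller = queue.popleft()
        for callee in graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    seen.discard(source)
    return seen


if __name__ == "__main__":
    print(sorted(tainted_services(ALLOWED_CALLS, "service.web")))
```

A real analysis would probably also track which interfaces carry which kinds of data; this sketch only answers the coarser question of reachability.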
matdehaast over 5 years ago
Curious if you looked at using OAuth with the client credentials grant for each service?

Also didn't see any mention of prior art like https://cloud.google.com/beyondcorp/.

Thanks for the great writeup!
brentis over 5 years ago
Nice work. If you define your policies based on a tagging taxonomy you could centrally manage these inbound/outbound service relationships. Every new instance or container would assume the same network policies based on its tag.
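
One way to picture the tag-based approach suggested above is the sketch below: policies are written once between tags, and a service inherits them from whatever tags it carries. The tag names, service names, and the `is_allowed` helper are all hypothetical, chosen only for illustration.

```python
# Hypothetical tag-level policy: a caller tag may call the listed callee tags.
TAG_POLICY = {
    "frontend": {"backend"},
    "backend": {"database"},
    "monitoring": {"frontend", "backend", "database"},
}

# Hypothetical tag assignment for individual services.
SERVICE_TAGS = {
    "service.web": {"frontend"},
    "service.account": {"backend"},
    "service.ledger-db": {"database"},
    "service.metrics": {"monitoring"},
}


def is_allowed(caller, callee, service_tags=SERVICE_TAGS, tag_policy=TAG_POLICY):
    """Return True if any tag on the caller may call any tag on the callee."""
    caller_tags = service_tags.get(caller, set())
    callee_tags = service_tags.get(callee, set())
    return any(callee_tags & tag_policy.get(tag, set()) for tag in caller_tags)


if __name__ == "__main__":
    # A new instance only needs a tag; no per-service rule is written.
    SERVICE_TAGS["service.web-2"] = {"frontend"}
    print(is_allowed("service.web-2", "service.account"))
    print(is_allowed("service.web-2", "service.ledger-db"))
```

The trade-off in this sketch is that rules are coarser than a per-service allow list, so a tag that is too broad grants more access than intended.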
hu3 over 5 years ago
> This would read all the Go code in our platform, and attempt to find code that looked like it was making a request to another service.

Is there a link about how much Go Monzo uses?
mschuster91 over 5 years ago
1.500 services? What the... the run times for calls must be *atrocious* with all the network communication and latency that is happening.
voltarolin over 5 years ago
Can a service mesh such as Istio provide the capability that Monzo have implemented themselves here?
kasey_junk over 5 years ago
Using YAML for critical infrastructure specification is one of the stupidest things we’ve ever done as an industry.