I've only read about service meshes; my impression is that they seem to add an awful lot of processes and complexity just to make developers' lives slightly easier.

Maybe I'm wrong, but it almost feels like busywork for DevOps. Is my first impression wrong? Is this the right way to architect systems in some use cases, and if so, what are they?
I just wrote something extremely similar, but it's only internal right now.

I personally find that the service-mesh value proposition is hard to justify for a serverless stack (mostly Cloud Run, but probably AWS Lambda too), and in situations where your services are mostly all in the same language, so you can bake the features into libraries that are much easier to import.

Observability is a great example of this. In serverless land, you're already getting the standard HTTP metrics (e.g. request count, response codes, latency), tracing, and standard HTTP request logging "for free."
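As a rough sketch of what "bake it into a library" can mean (everything below is illustrative, not a real library): each service imports a shared wrapper that emits the request count, status, and latency a sidecar proxy would otherwise report for you.

    # Hypothetical shared observability library that every service imports,
    # instead of getting these features from a sidecar proxy.
    import functools
    import logging
    import time

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("obs")

    def instrumented(handler):
        """Wrap an HTTP handler: log request path, status, and latency."""
        @functools.wraps(handler)
        def wrapper(request):
            start = time.monotonic()
            status = 500  # assume failure unless the handler returns
            try:
                response = handler(request)
                status = response.get("status", 200)
                return response
            finally:
                elapsed_ms = (time.monotonic() - start) * 1000
                log.info("method=%s path=%s status=%d latency_ms=%.1f",
                         request.get("method", "GET"),
                         request.get("path", "/"),
                         status, elapsed_ms)
        return wrapper

    @instrumented
    def get_pets(request):
        return {"status": 200, "body": ["rex", "whiskers"]}

    if __name__ == "__main__":
        get_pets({"method": "GET", "path": "/pets"})

If all your services share a language, shipping that as a versioned internal package is one import per service, versus a proxy per pod.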
Now imagine you have something with the complexity and change volume of a distributed control plane, bringing together load balancing, service advertisement, public-key infrastructure, and software-defined networking, and then try to imagine running it at the same reliability as your DNS.

Also: proxies, proxies everywhere, as far as the eye can see.
Thanks for this.

I have never deployed or used a service mesh, but I am designing something similar at the code layer. It is designed to route between server components; that is, to route at the architectural level, between threads in a multithreaded system.

The problem I want to solve is that I want architecture to be trivially easy to change with minimal *code* changes. This is the promise and allure of enterprise service buses, message queues, and probably Spring.

I have managed RabbitMQ and I didn't enjoy it.

I want a system that can scale up and down, where multiple instances of any system object can be introduced or removed without drastic rewrites. I would like to decouple bottlenecks from code and turn them into runtime configuration.

My understanding of things such as Traefik and Istio is that they are frustrating to set up.

Specifically, I am working on designing inter-thread communication patterns for multithreaded software. How do you design an architecture that is easy to change, scales, and is flexible?

I am thinking of a message routing definition format that is extremely flexible and allows any topology to be created:

https://github.com/samsquire/ideas4#526-multiplexing-setting-format-and-routing-operation

I think there is an application of the same pattern to the network layer too.

Each communication event has associated with it an environment of key-values that looks similar to this:

    petsserver1
    container1
    thread3
    socket5
    user563
    ingestionthread1
These can be used to route keyspace ranges to other components, such as routing particular users to tenant shards, or load balancing. For example, users 1-1000 are handled by petsserver1, and socket5 is associated with thread3.

In other words: changing the RabbitMQ routing settings doesn't change the architecture of your software. You have to change the architecture of the software to match the routing configuration. But what if you changed the routing configuration and the application architecture changed to match?
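As a rough illustration of that idea (the rule format and all names below are hypothetical, not the format from the link above): the event's environment of key-values is matched against routing rules that live in configuration, so re-wiring the topology means editing rules rather than application code.

    # Toy sketch: routing by key-value environment. Rules are pure
    # configuration; changing a rule changes the topology, not the code.

    RULES = [
        # (conditions on the environment, destination component)
        ({"user": range(1, 1001)}, "petsserver1"),  # users 1-1000 -> tenant shard
        ({"socket": "socket5"}, "thread3"),         # socket5 belongs to thread3
    ]

    def matches(env, conditions):
        """True if every condition holds for this event's environment."""
        for key, expected in conditions.items():
            value = env.get(key)
            if isinstance(expected, range):
                if value not in expected:
                    return False
            elif value != expected:
                return False
        return True

    def route(env):
        """Return the first destination whose rule matches; first match wins."""
        for conditions, destination in RULES:
            if matches(env, conditions):
                return destination
        return "deadletter"

    print(route({"user": 563, "socket": "socket5"}))   # petsserver1
    print(route({"user": 2000, "socket": "socket5"}))  # thread3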
Service meshes make it easier to roll out advanced load management and reliability features, such as prioritized load shedding, which would otherwise need to be implemented within each language/framework.

For instance, the Aperture[0] open-source flow control system is built on service meshes.

[0]: https://github.com/fluxninja/aperture

[1]: https://docs.fluxninja.com
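To make "prioritized load shedding" concrete, here is a toy sketch (not Aperture's actual API; the capacity limit and request names are made up): under overload, admit the highest-priority requests and shed the rest.

    # Toy prioritized load shedding: when demand exceeds capacity,
    # drop the lowest-priority requests first.
    import heapq

    CAPACITY = 2  # assumed admission limit per scheduling tick

    def admit(requests):
        """Admit up to CAPACITY requests, highest priority first."""
        # heapq is a min-heap, so negate priority to pop highest first.
        queue = [(-priority, name) for name, priority in requests]
        heapq.heapify(queue)
        admitted = [heapq.heappop(queue)[1]
                    for _ in range(min(CAPACITY, len(queue)))]
        shed = [name for _, name in queue]  # whatever is left gets shed
        return admitted, shed

    admitted, shed = admit([("checkout", 10), ("search", 5), ("crawler", 1)])
    print(admitted)  # ['checkout', 'search']
    print(shed)      # ['crawler']

The point of doing this in a mesh rather than in code is that the priorities and limits become fleet-wide policy instead of per-service logic.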