> The main reason we can not use system clocks is that system clocks across servers are not guaranteed to be synchronized.<p>Sentences like this make sure I never regret moving my infrastructure to bare metal. My clocks are synchronized to within several nanoseconds, with leap-second skew and all kinds of shiny things. It literally took a day to set up, plus a blessing from an ISP in the same datacenter to use their clock sources (GPS + PTP). All the other servers are synchronized to that one via Chrony.
Can someone explain why, in an interview context, someone with the technical ability to understand, assess, write, and communicate a deep set of domain-specific knowledge like this... might still be asked to do some in-person leetcode tests? How does on-the-fly recursive algo regurgitation sometimes mean more than being able to demonstrate such depth of knowledge?
Distributed computing > Theoretical foundations: <a href="https://en.wikipedia.org/wiki/Distributed_computing#Theoretical_foundations" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Distributed_computing#Theoreti...</a><p>Distributed algorithm > Standard problems: <a href="https://en.wikipedia.org/wiki/Distributed_algorithm#Standard_problems" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Distributed_algorithm#Standard...</a><p>Notes from "Ask HN: Learning about distributed systems?" <a href="https://news.ycombinator.com/item?id=23932271">https://news.ycombinator.com/item?id=23932271</a> ; CAP(?), BSP, Paxos, Raft, Byzantine fault, Consensus (computer science), Category: Distributed computing<p>"Ask HN: Do you use TLA+?" (2022)
<a href="https://news.ycombinator.com/item?id=30194993">https://news.ycombinator.com/item?id=30194993</a> :<p>> <i>"Concurrency: The Works of Leslie Lamport" ( <a href="https://g.co/kgs/nx1BaB" rel="nofollow noreferrer">https://g.co/kgs/nx1BaB</a></i> )<p>Lamport timestamp > Lamport's logical clock in distributed systems: <a href="https://en.wikipedia.org/wiki/Lamport_timestamp#Lamport's_logical_clock_in_distributed_systems" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Lamport_timestamp#Lamport's_lo...</a> :<p>> <i>In a distributed system, it is not possible in practice to synchronize time across entities (typically thought of as processes) within the system; hence, the entities can use the concept of a logical clock based on the events through which they communicate.</i><p>Vector clock: <a href="https://en.wikipedia.org/wiki/Vector_clock" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Vector_clock</a><p>> <i><a href="https://westurner.github.io/hnlog/#comment-27442819" rel="nofollow noreferrer">https://westurner.github.io/hnlog/#comment-27442819</a> :</i><p>>> <i>Can there still be side channel attacks in formally verified systems? Can e.g. TLA+ help with that at all?</i>
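For readers who want the logical-clock idea quoted above in concrete form, here is a minimal sketch of a Lamport clock. The class and method names (`LamportClock`, `tick`, `send`, `receive`) are illustrative assumptions, not taken from any of the linked pages: each process keeps a counter, increments it on every local event or send, and on receive jumps to one past the larger of its own counter and the message's timestamp.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Hypothetical minimal Lamport logical clock for one process."""

    time: int = 0

    def tick(self) -> int:
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is itself an event; the returned value is the
        # timestamp to attach to the outgoing message.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Move past both our own history and the sender's.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two processes exchanging one message.
a, b = LamportClock(), LamportClock()
a.tick()             # local event on a
stamp = a.send()     # a sends a message to b
b.receive(stamp)     # b's clock is now greater than the stamp
```

This only guarantees that if one event happened before another, its timestamp is smaller; the reverse does not hold, which is the gap vector clocks address.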
The book is available for pre-order, and shipping in September: <a href="https://www.amazon.com/dp/0138221987/" rel="nofollow noreferrer">https://www.amazon.com/dp/0138221987/</a>
I know the author of the blog and the book personally. The author has built small storage engines himself and grokked the code bases of Cassandra and other DBs to understand the patterns in code, not just as theoretical concepts. The blog has code excerpts as well. Highly recommended read for hands-on folks.
Commenters here are missing the point: the intent is to take otherwise isolated systems with properties that are very difficult to control, such as varying amounts of clock skew or arbitrary process pauses due to GC cycles or CPU consumption, and build a system on top that allows for the storage of mutable state. An example would be a cluster of etcd or dqlite instances (which Kubernetes in multi-master setups also uses, BTW), or at a larger scale, something like DynamoDB.<p>It’s one of the more approachable resources on the design of distributed systems, and a good read.
<a href="https://microservices.io/patterns/index.html" rel="nofollow noreferrer">https://microservices.io/patterns/index.html</a> has some patterns as well that aren't necessarily specific to a microservice architecture.
> The main reason we can not use system clocks is that system clocks across servers are not guaranteed to be synchronized.<p>Data center/cloud system clocks can be tightly synchronized now in practice. They are still never perfect, and race conditions abound.<p>But that doesn't mean you can't rely on a clock to determine ordering. Google popularized a different approach with TrueTime/Spanner: <a href="https://cloud.google.com/spanner/docs/true-time-external-consistency" rel="nofollow noreferrer">https://cloud.google.com/spanner/docs/true-time-external-con...</a>
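As a rough illustration of the TrueTime idea mentioned above, here is a sketch of "commit wait" over a clock that reports an uncertainty interval instead of a single instant. Everything here is an assumption for illustration: `UncertainClock`, `Interval`, `commit`, and the fixed `epsilon` are hypothetical names, and this is not Spanner's actual API. The point is only that if every commit waits out the uncertainty before returning, timestamps from non-overlapping commits are ordered even though no clock is perfectly synchronized.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    earliest: float
    latest: float


class UncertainClock:
    """Hypothetical stand-in for a TrueTime-like clock (not a real API)."""

    def __init__(self, epsilon: float, now_fn=time.time):
        self.epsilon = epsilon  # assumed bound on clock error, in seconds
        self._now = now_fn

    def now(self) -> Interval:
        # True time is assumed to lie somewhere inside this interval.
        t = self._now()
        return Interval(t - self.epsilon, t + self.epsilon)

    def after(self, ts: float) -> bool:
        # True only once ts has definitely passed.
        return self.now().earliest > ts


def commit(clock: UncertainClock, sleep=time.sleep) -> float:
    # Pick a timestamp no earlier than any possible current time...
    ts = clock.now().latest
    # ...then wait until that timestamp is definitely in the past.
    while not clock.after(ts):
        sleep(clock.epsilon / 4)
    return ts


# Demo with a fake clock so the wait is deterministic.
fake_time = [100.0]


def fake_sleep(seconds: float) -> None:
    fake_time[0] += seconds


clock = UncertainClock(epsilon=1.0, now_fn=lambda: fake_time[0])
ts1 = commit(clock, sleep=fake_sleep)
ts2 = commit(clock, sleep=fake_sleep)
```

The cost is latency: each commit waits roughly twice the uncertainty bound, which is why tight clock synchronization still matters in this design.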
There's a book that compiles all these patterns:<p><a href="https://learning.oreilly.com/library/view/patterns-of-distributed/9780138222246/cover.html" rel="nofollow noreferrer">https://learning.oreilly.com/library/view/patterns-of-distri...</a>
Related:<p><i>Patterns of Distributed Systems (2020)</i> - <a href="https://news.ycombinator.com/item?id=26089683">https://news.ycombinator.com/item?id=26089683</a> - Feb 2021 (58 comments)