I got AWS certified this week, and as I’ve been learning more about what goes into the design of various systems, it occurs to me that I don’t have a clear understanding of when a single, unique component of a system should be considered a pro or a con.<p>For instance, it makes sense to me that we would want something like the database for an application to be deployed in multiple AZs to increase availability via no single point of failure.<p>It _also_ makes sense to me that if I’m using Redux in a frontend web app, one benefit we get is a single source of truth for app state.<p>These two make sense to me independently, but I am struggling to understand how the evaluation differs depending on the context. My gut is telling me that I’m trying to compare two different levels of abstraction, but I just can’t quite come up with a rule for when one of these principles applies and the other does not.<p>What questions should I be asking myself to determine which of these is appropriate to apply? Thanks
Think of it in terms of CAP (consistency, availability, and partition tolerance). CAP applies to any data in your system.<p>Most people think of CAP in the context of distributed data stores, where you have to worry about how long a write takes to become visible everywhere, how long replicas will be out of sync, and what network conditions will break that.<p>But it really applies to anything. With Redux, the state is stored in the browser. Is it consistent? Well, it's the only place the data is stored, so yes. Is it available? From the user's perspective, yes. It's always available to them as long as their device is available to them. What happens when there is a partition? Well, they can't get to your app at all, so the state won't change. But that's fine, because they aren't worrying about state if they can't get to your app.<p>So Redux is consistent and available but not partition tolerant, which is just fine for that use case.<p>For the database that the app is based on, it's a different calculus. Having a single data store would be bad because while it would be consistent and partition tolerant, it wouldn't be available at all times. Chances are a web app would prefer availability as the most important aspect, and then you have to trade off consistency or partition tolerance depending on the use case.
The biggest question to ask yourself is what are the failure modes you're worried about?<p>Redux isn't going to suddenly go offline, because it's running in the same process as your UI code. Redundancy therefore buys you nothing. The important failure modes in single-process UI code are all logic errors, and the best defense against those is to reduce the amount of unnecessary logic: so use a single source of truth.<p>With a database, the failure modes are very different. You're using a battle-hardened database whose logical consistency you have every reason to trust. The biggest failure modes are driven by the fact that you're running on a network, and networks are flaky. The best mitigation against these failures is to run redundant copies and trust that the database authors got their syncing code right.
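To make that last point concrete, here's a minimal sketch in plain TypeScript (the names and state shape are made up for illustration): keep one authoritative piece of state and derive everything else from it, instead of storing second copies that can drift apart.

    // The single source of truth: one authoritative list of items.
    interface State {
      items: { id: string; price: number }[];
    }

    // Derived value: computed on demand, so it can never disagree
    // with `items`. Storing a separate `total` field would invite
    // exactly the kind of logic error the comment above describes.
    function selectTotal(state: State): number {
      return state.items.reduce((sum, item) => sum + item.price, 0);
    }

    type Action = { type: "addItem"; item: { id: string; price: number } };

    function reducer(state: State, action: Action): State {
      switch (action.type) {
        case "addItem":
          return { items: [...state.items, action.item] };
        // Note there is no "updateTotal" action: less logic to get wrong.
        default:
          return state;
      }
    }

The point is that selectTotal can never disagree with items, so one whole class of logic errors simply can't happen.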
There are some interesting and relevant responses here, yet none of them clearly addresses the specific question you’ve raised: how are single-point designs more advantageous in some contexts, and redundancies more advantageous in others?<p>The most meaningful difference in these circumstances is between coherent design and delegation.<p>Coherent design is the Redux case: you store your information where it is most immediately available for use. Consider the paperwork for a busy business. You have one place where that paperwork goes, so it stays with other like things and you can easily find it when you need it.<p>Delegation, however, depends on factors outside of your immediate environment. You may want to delegate a redundant backup of your paperwork offsite, yet you would still have only one place to look for it in your office (lest you accidentally maintain multiple divergent copies of the same files).<p>When delegating, you may well have a primary, yet there is a risk/overhead tradeoff in how many redundant delegates you rely upon (as many as you can afford to keep consistent).<p>Notice that even when delegating, you still want one coherent WAY of delegating: one library (Redux) organizing your remote data requests, rather than each piece of business logic deciding for itself how to fetch data.<p>Ask yourself: what does this simplify? Am I delegating control of a valuable asset? What happens when that part cannot be relied upon?<p>In this case, you are comparing in-app business logic organization with network topology.
Single points of failure are weaknesses against events outside of your control.<p>Redundancy protects against single points of failure, so redundancy only makes sense when you're trying to protect against something outside of your control.<p>Your database might go away for many different reasons. There could be a network error that prevents you from reaching it. A power outage could take the data center down. A hardware failure could make the server crash.<p>But if you have multiple instances of your database, you can protect against that. If "the platform" (AWS, Azure, whatever) detects that one instance of your database is unreachable, for whatever reason, it can invisibly start pointing your app to a different instance. Redundancy protects you from these outages.<p><i>But</i> if Redux goes down, there's really only one reason: there's a logic error in your code. You can't protect against this with redundancy; you can only protect against it by finding and fixing errors. If the Redux layer crashes every time you send it the string "hello.world.png", having multiple instances of Redux won't help; you need to find the logic error and fix it.<p>Also, your app should (and probably does) treat the database as a single source of truth, even if you have multiple instances of it running "for real". To give a basic example, you might have one Leader database, where all of the writes happen "for real", and a bunch of Follower databases, which subscribe to the Leader and copy whatever it does. Then, if the Leader goes down, one of the Followers becomes the new Leader, and the new, single source of truth.<p>But your app doesn't need to know about any of that; the platform takes care of it invisibly. You point your app at "db.aws.com/?db=12345" (or whatever), and AWS automatically redirects you to the Leader. So with this scheme, you get the benefits of both single source of truth <i>and</i> redundancy.
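A rough sketch of what that looks like from the app's side, using node-postgres (the endpoint here is a hypothetical example; your platform's failover details will differ):

    // The app only ever knows ONE address: the cluster endpoint.
    // Which physical instance answers (the Leader today, a promoted
    // Follower tomorrow) is the platform's problem, not the app's.
    import { Pool } from "pg";

    const pool = new Pool({
      host: "mydb.cluster-abc123.us-east-1.rds.amazonaws.com", // hypothetical cluster endpoint
      database: "app",
      user: "app_user",
      password: process.env.DB_PASSWORD,
    });

    // From the app's point of view this is a single source of truth;
    // redundancy and failover happen invisibly behind the endpoint.
    async function getUser(id: string) {
      const { rows } = await pool.query("SELECT * FROM users WHERE id = $1", [id]);
      return rows[0];
    }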
First, everything is a tradeoff. Do not expect to have your cake and eat it too.<p>Secondly, these are indeed two different levels. A database that serves as a single source of truth can be implemented as an entire cluster of servers with no single point of failure. At least in theory ;)
A distributed database shouldn't violate SSOT, because the different nodes will agree on the truth (which came from a single source somewhere upstream, like a git repo). If they don't agree, something has gone really wrong (a split-brain scenario?).
"Single source of truth" is usually seen in data warehousing /BI world, where you have multiple source systems (say, a CRM and a ordering system) or reporting systems and you want them to agree when yous ask, "how many repeat customers did we have last month?"<p>On the app dev side, where you frequently have more writes than reads, and transactional guarantees (e.g. not allowing a bank transfer if there's no money in the account) ... what you want is <i>eventual</i> consistency, or some form of pessimism built in to your state maangement.<p>So not really "single source of truth" so much as "single way of reconciling the consequences of distributed event processing."
They are not independent: if a partition occurs, you can give up consistency for availability. For example, if you have a database server and it becomes unavailable, the best you can do is eventual consistency, and if you accept that, you no longer have a single source of truth until your causality protocol eventually reestablishes it.<p>For example, the client can decide to give up availability if the server is unavailable, or give up consistency by accepting writes, which later have to be merged back, somehow handling any conflicts that wouldn't occur otherwise.<p><a href="https://en.wikipedia.org/wiki/PACELC_theorem" rel="nofollow">https://en.wikipedia.org/wiki/PACELC_theorem</a>
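Here's a naive sketch of that second option in TypeScript (illustrative only: last-writer-wins can silently drop updates, which is why real systems reach for vector clocks or CRDTs): accept writes during the partition, then merge them back with an explicit conflict rule.

    interface Versioned<T> {
      value: T;
      updatedAt: number; // wall-clock time; a real causality protocol
                         // would use vector clocks or similar
    }

    type Doc = Record<string, Versioned<string>>;

    // Merge local offline writes back into the server's copy.
    function merge(local: Doc, remote: Doc): Doc {
      const out: Doc = { ...remote };
      for (const [key, entry] of Object.entries(local)) {
        const other = out[key];
        // Conflict: both sides touched the key during the partition.
        // Resolution rule here: last writer wins.
        if (!other || entry.updatedAt > other.updatedAt) {
          out[key] = entry;
        }
      }
      return out;
    }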
The concept you're looking for is linearizability. Basically, it's the idea of being able to pretend that your distributed system is a single node with a single source of truth. An example system that has this property is etcd: <a href="https://jepsen.io/analyses/etcd-3.4.3" rel="nofollow">https://jepsen.io/analyses/etcd-3.4.3</a><p>If you want to learn more about this, take a look at Designing Data-Intensive Applications by Martin Kleppmann.
The difference is that the database is an external entity. The physical representation of the database's state is not owned by your app. Your Redux instance, on the other hand, is owned by your client-side app. It lives and dies with your app.<p>At least, that is how your user expects your app to behave.<p>There are no strict rules for system design other than "whatever makes the most sense."
Your whole frontend JS app is always a single point of failure for each individual user anyway.<p>But within the app, assuming it works as expected, you can still reason about where you keep your state.<p>No need to play with words and general abstract concepts; just focus on the specific problem you're solving.
That’s why you want a single <i>service</i> of truth. Abstract the consumer from the source and allow yourself to scale the physical source of truth without impacting the consumer.
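A minimal sketch of that idea in TypeScript (all names are illustrative): consumers depend on an interface, not the physical store, so the backing implementation can go from one box to a multi-AZ cluster without touching any consumer.

    interface User {
      id: string;
      name: string;
    }

    // The "service of truth": consumers only ever see this interface.
    interface UserStore {
      getUser(id: string): Promise<User | undefined>;
      saveUser(user: User): Promise<void>;
    }

    // Consumers are written against UserStore only.
    async function greet(store: UserStore, id: string): Promise<string> {
      const user = await store.getUser(id);
      return user ? `Hello, ${user.name}` : "Hello, stranger";
    }

    // Today: a single in-memory map. Tomorrow: a replicated cluster
    // behind the same interface -- greet() never changes.
    class InMemoryUserStore implements UserStore {
      private users = new Map<string, User>();
      async getUser(id: string) { return this.users.get(id); }
      async saveUser(user: User) { this.users.set(user.id, user); }
    }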