Out of the Tar Pit (2006) [pdf]

119 points by n0w about 2 years ago

11 comments

dcre about 2 years ago

This paper was very influential on me when I first started programming professionally around 2012. I don't plan on reading it again, but my vague memory of what I got out of it is pretty simple and I think has become pretty standard practice at this point: avoid mutable state and use pure functions where possible.

The framing of accidental and essential complexity is of course very useful and not really unique to this paper. The difficulty is there is nothing but experience-informed judgment that can tell you which complexity is accidental and which is essential. There is always a framing of the problem that can justify some bit of complexity as essential. You have to judge.

dagss about 2 years ago

I feel event sourcing is a real-world, pragmatic approach to the declarative programming that this paper advocates.

For state changes you add events to the database to describe something that happened. Any question you may need an answer for / business decision you want to make can be answered by querying the events.

The problem at the moment is that while event sourcing is excellent at reducing accidental complexity surrounding implementing business rules, there is little standard / commonly used tooling around it and you end up with lots of accidental complexity on that end.

An example would be a database not designed to be a CRUD store but to store events and manage read models, and manage computation of projections etc -- while being suitable for OLTP workloads. At a minimum, very strong support for using any kind of construct in materialized views (since in a sense the entire business logic is written as a "materialized view" when doing event sourcing).
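A minimal sketch of the idea in the comment above, not taken from the paper or from any particular event store. The `Event` type, the event kinds, and the `balance` read model are hypothetical names chosen for illustration: state changes are appended as events, and a question is answered by querying them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One immutable fact about something that happened (hypothetical shape)."""
    account: str
    kind: str    # "deposited" or "withdrew" (illustrative event kinds)
    amount: int


def balance(events, account):
    """Read model: derive an account balance by folding over the event log."""
    total = 0
    for e in events:
        if e.account != account:
            continue
        total += e.amount if e.kind == "deposited" else -e.amount
    return total


# The log is append-only: state changes are recorded, never overwritten.
log = []
log.append(Event("alice", "deposited", 100))
log.append(Event("alice", "withdrew", 30))
log.append(Event("bob", "deposited", 5))
```

Here `balance` plays the role of the "materialized view": it is plain logic over the events, which a real system would presumably cache or maintain incrementally.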

Verdex about 2 years ago

In the past, I have been unimpressed by this paper. Perhaps someone can shed some historical context...

But from my perspective what happens is that the paper defines complexity in exactly the way that allows them to deride OO, FP, etc programming whilst simultaneously showing how awesome functional relational programming is. It ignores complexity that's orthogonal to what FRP addresses and ignores areas in which FRP itself contributes to unnecessary complexity.

It feels like a scenario where the authors had something that they thought was neat and went out to create metrics that would in fact show that it was neat. Maybe FRP is really neat, but I feel that the paper itself doesn't contribute to anything because its logic is so custom and purpose built.

ZitchDog about 2 years ago

A classic paper with a dream that has yet to be realized. We continue to bolt state on top of state. Redux bolted on top of GraphQL on top of Redis on top of Postgres. We can do better.

cloogshicer about 2 years ago

I think the crux of this paper is section 7.2.2:

> There is one final practical problem that we want to consider — even though we believe it is fairly rare in most application domains. In section 7.1.1 we argued that immutable, derived data would correspond to accidental state and could be omitted (because the logic of the system could always be used to derive the data on-demand). Whilst this is true, there are occasionally situations where the ideal world approach (of having no accidental state, and using on-demand derivation) does not give rise to the most natural modelling of the problem. One possible situation of this kind is for derived data which is dependent upon both a whole series of user inputs over time, and its own previous values. In such cases it can be advantageous to maintain the accidental state even in the ideal world. An example of this would be the derived data representing the position state of a computer-controlled opponent in an interactive game — it is at all times derivable by a function of both all prior user movements and the initial starting positions, but this is not the way it is most naturally expressed.

Emphasis is mine.

I think that this type of derived data, which I put in italics above, is quite common - contrary to what the authors of the paper argue. Any UI code or game-like system, like simulations, will have this kind of data. And the paper does not have a good answer for it. I honestly think that nobody has an answer for it, and it's why most of our UIs suck.

I would love to see something that makes handling this type of derived data easy.
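A small sketch of the trade-off the quoted passage describes, under an invented game rule. `START`, `step`, and `derive_position` are hypothetical, and the "opponent mirrors the player" rule is made up for illustration. The same position can be derived on demand from the whole input history, or kept as state and updated per input.

```python
from functools import reduce

START = (0, 0)  # assumed initial opponent position


def step(opponent, player_move):
    """Hypothetical rule: the opponent mirrors each player move."""
    dx, dy = player_move
    x, y = opponent
    return (x - dx, y - dy)


def derive_position(moves):
    """'Ideal world' form: recompute from START and the full input history."""
    return reduce(step, moves, START)


moves = [(1, 0), (0, 2), (-3, 1)]

# The form most code reaches for: keep the previous value and update it
# as each input arrives. This is the "accidental state" in the paper's terms.
position = START
for m in moves:
    position = step(position, m)
```

Both forms agree on the result. The derived one needs the entire move history on every query, which may be part of why the stateful form feels more natural in UI and simulation code.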

n0w about 2 years ago

I came across this after seeing relic[0] submitted the other day and thought it was pretty interesting.

I've been into CRDTs for a while and have started wondering about generic mechanisms for distributed data. This led me to read a lot more about the Relational Model of data and eventually to the Event Calculus.

What's interesting to me is that these things end up feeling a lot like CRDTs[1] or Event Sourcing. I haven't quite finished pulling on these threads but the relic link was a timely read considering!

I really liked the first half of this paper and the authors' categorization of complexity. However, the second half fell a bit short for me. It seems they made the same mistake as many other people (SQL != Relational) and their idea of Feeders and Observers seems a bit more like an escape hatch than an elegant method for interfacing with the outside world.

[0] https://github.com/wotbrew/relic

[1] http://archagon.net/blog/2018/03/24/data-laced-with-history/

carapace about 2 years ago

I'm designing such a system now, based on the pure functional Joy language with two additional data stores: a relational db system (Prolog (or maybe Datalog), not SQL) and what is effectively a git repo although I think of it as a "data oracle".

The role of the relational db system is explained in TFA. (Prolog makes a fine Relational Model DB (the "relations" in RMDBs are the same logical relations that Prolog "relations", um, are.) The language is cleaner and simpler and more powerful than stock SQL, there's an ISO standard and several solid implementations, and you can always back it up with SQLite or Postgres or whatever if you need to.) The trick to integrating it with a purely functional system is to only use "pure and monotonic Prolog code" which you want to do anyway ( https://www.metalevel.at/prolog/debugging ) or as I like to say, "Don't put lies in your database."

The "data oracle" (which again is more-or-less just a git repo) provides bytes given a three-tuple of (hash, offset, length). These are immutable, so you can cache the results of (pure) computations over them (e.g. a predicate like "is valid UTF-8" is true/false for all time, yeah?) This replaces the filesystem.

I was working with Prof. Wirth's Oberon RISC CPU as a basis, but a couple of days ago a fantastic new 64-bit vm went by here on HN and I'm going to use that going forward. https://github.com/maximecb/uvm https://news.ycombinator.com/item?id=34936729
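A rough sketch of what a "data oracle" along these lines might look like. This is an assumption-laden illustration, not the commenter's design: `DataOracle`, `put`, `get`, and `is_valid_utf8` are hypothetical names, an in-memory dict stands in for the repository, and SHA-256 is used for hashing where git's traditional default is SHA-1.

```python
import hashlib
from functools import lru_cache


class DataOracle:
    """Content-addressed byte store (hypothetical in-memory stand-in)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store bytes under their own hash and return that hash."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str, offset: int, length: int) -> bytes:
        """Return the bytes named by the (hash, offset, length) three-tuple."""
        return self._blobs[digest][offset:offset + length]


oracle = DataOracle()


@lru_cache(maxsize=None)
def is_valid_utf8(digest: str, offset: int, length: int) -> bool:
    """Pure predicate over immutable bytes, so its result can be cached forever."""
    try:
        oracle.get(digest, offset, length).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Because a given hash can only ever name one byte string, the cache never needs invalidating. That is the property the comment relies on.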

bob1029 about 2 years ago

This paper was career changing for me.

Chapter 9 is effectively the original concept for our business rules engine today. We use SQLite and C# UDFs as the actual foundation.

Using a proper relational model and SQL queries for all the things means that domain experts can directly contribute. It also makes it feasible to turn your domain experts into internal customers of your developers. For B2B products, this can be make or break.

Building an actual business around this kind of thing is very hazardous in my experience. Unless you are completely certain that you have the schema figured out, this promising foundation converts to quicksand.

Using higher normal forms is one way to reduce the consequence of screwing up domain modeling, but you still really have to know the relational guts of the business one way or another.

One design-time trick is to get all stakeholders to compose their ideal schema in something like Excel and have them fill in some sample data. Showing them an example of one of these for a different domain is a powerful lesson in my experience.

feoren about 2 years ago

My impression from reading this has always been "Almost there! Keep going." If you keep pulling on these threads, I claim that you reach some inevitable conclusions. These ideas are a superset of all the other advice they give (and most other advice you hear, too):

1. Separate identity from data. These are completely different things. The number 7 is not mutable, even though my age is. Keep going! The phrase "It was a dark and stormy night" is not mutable, even though the opening line of your book is. Keep going! The statement "there are seventeen cars parked on level 3 and four spots available" is not mutable, even though the status of the parking garage is. Identity is immutable, and statements (data) are also immutable. Only the assignment of a statement to an identity is mutable.

2. Think about the operations that make sense on your data. Sure, you have "map", "filter", "reduce" that are mathematically pure. What about "offset" or "rotate"? Those have identity, they are associative, individually commutative (but not with each other!). Is there a distributive property that applies there? Okay, what about "buy" and "sell" operating on commodities? Are those mathematical operations? Do they have an identity element? Are they associative and commutative? "Buy 2000 lbs of chicken" -- is that equivalent to "Buy 1 ton of chicken"? What are its domain and range? Not "chicken", nor even "warehouses" -- you're not teleporting the chicken as soon as you commit that record. More like "contract", or even "contract request". What can you do with contracts? Is there an "identity contract"? "Buy 0 chicken"? Zero is a useful number! Is "Buy 0 chicken" the same as "Buy 0 beef"? Explore these questions. Find the math around your domain. Functional purity is all good, but it's wasted if your domain is just "Chicken : Meat : Commodity", "Beef : Meat : Commodity", "Pine : Lumber : Commodity". Wrong. Don't think about nouns. Think about sensible pure operations in your domain. Successive abstraction layers should be restricting what's possible to do with your data, by defining higher and higher-level operations. If your abstraction layers aren't restricting what's possible to do, they're an inner platform and you don't need them.

3. Don't make big decisions upfront. That's how you add accidental complexity. Don't make any decisions you don't need to, and prefer solutions that allow you to defer or lighten decisions. If you follow this thread, that means you're using a relational model. They absolutely got that right (section 8). Otherwise you're making decisions about ownership that you don't need to be making. Domain Driven Design has it wrong here with aggregate roots. What's an aggregate? Which concept goes inside of which aggregate? You do not need to make that decision. The authors of TFA get it right, and then wrong again, when they try to apply a relational model to shitty noun-based domain ideas like "Give Employee objects a reference to their Department, vs. Give Department objects a set (or array) of references to their Employees vs. Both of the above". No. They're not following their own advice. Can an employee exist without the concept of "department"? Yes. Can a department exist without the concept of employees? Probably. Therefore you must encode the relationship separately from both. Your hands are tied. If concept A can exist independently of concept B, you must not draw an arrow from A to B. The answer is not "both", it's "neither". An employee's assignment within a department is its own independent concept, that knows about "employee" and "department" both. And now it's obvious you can give this concept its own tenure and add a new record whenever they change departments. You've found append-only event sourcing without needing to start there -- it came from first principles (in this case).

4. Operate on the broadest scope you can. Manipulate sets of things, not individual things. This is half of functional programming's usefulness right here. This is supported by using a relational model. Operate on sequences of things, not individual things: there's reactive programming. How else can you broaden the scope of what you're operating on?

5. Don't just do something, stand there! That is: don't do, plan. Instead of immediately looping, just use "map". Instead of immediately hitting the database, write a query plan. This doesn't mean offload your thinking to GraphQL -- that's just kicking the can down the road. See point #2. This is the other half of functional programming's usefulness. Do only as much as you need to make (or append to) a plan, and then stop. Someone else will execute it later -- maybe! You don't care.

There's probably more, but there are more fundamental ideas than "use functional programming" or "use event sourcing" or "use SOLID" -- first principles that actually lead to almost all the other good advice you get. This paper kinda almost gets there a couple times, then walks back again. Keep going. Keep pulling on those threads. I suspect that the more you do this, the more your code will change for the better. But you have to be willing to keep going -- don't half-ass it. If you only go part way and then stop, the benefits won't be apparent yet.
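Point 3 above can be sketched roughly as follows. This is one possible reading, with hypothetical names (`Assignment`, `department_of`, `members`) and made-up sample data: neither employees nor departments reference each other, and the assignment is its own append-only relation with a start date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assignment:
    """The relationship is its own concept; it knows about both sides."""
    employee: str
    department: str
    start: date


# Employees and departments exist independently; neither points at the other.
employees = {"ada", "grace"}
departments = {"research", "ops"}

assignments = [
    Assignment("ada", "research", date(2020, 1, 1)),
    Assignment("grace", "ops", date(2020, 6, 1)),
]

# Changing departments appends a record instead of mutating an existing one.
assignments.append(Assignment("ada", "ops", date(2022, 3, 1)))


def department_of(employee, on):
    """Latest assignment that had started by the given date, or None."""
    started = [a for a in assignments if a.employee == employee and a.start <= on]
    return max(started, key=lambda a: a.start).department if started else None


def members(department, on):
    """Set-level query in the spirit of point 4: who was in a department on a date."""
    return {e for e in employees if department_of(e, on) == department}
```

The history falls out for free: asking about an earlier date simply ignores later records.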

vendiddy about 2 years ago

There are a lot of comments saying that we can't avoid mutable state and dismiss this paper entirely.

I find a practical interpretation of this paper is:

- Favor pure functions over impure functions
- Reduce mutable state and be deliberate about where those mutations have to happen
- Prefer derived state over keeping state in sync

There are always exceptions. Use your judgment on when this simplifies code and when it doesn't.
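A minimal illustration of "prefer derived state over keeping state in sync", using two hypothetical cart classes invented for this sketch. `CartSynced` stores a total that every mutation site has to remember to update; `CartDerived` computes it from the items on demand.

```python
class CartSynced:
    """Keeps `total` as separate state that must stay in sync with `items`."""

    def __init__(self):
        self.items = []
        self.total = 0

    def add(self, price):
        self.items.append(price)
        self.total += price  # every mutation site must remember this line


class CartDerived:
    """Stores only the essential state; `total` is derived when asked for."""

    def __init__(self):
        self.items = []

    def add(self, price):
        self.items.append(price)

    @property
    def total(self):
        return sum(self.items)
```

If some code path changes `items` without going through `add`, the synced total silently goes stale while the derived one cannot. The cost is recomputation on each read, which is where the "use your judgment" caveat comes in.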

coffeebeqn about 2 years ago

Does anyone have this in audio form? I don't have an ergonomic way to read this even though it looks really on point.