Programming as Theory Building (1985)

193 points by onlurking almost 5 years ago

14 comments

optymizer almost 5 years ago

The topic of how we as developers implement solutions in code has been on my mind for years.

The one insightful idea I found in this essay is that coding is a lossy one-way operation, from which you cannot fully derive the original idea or the 'theory'. That seems similar to losing information when compiling source code, making it impossible to restore the exact source code from its machine code representation.

So if we work backwards, it's: machine code (bits) -> source code (text) -> idea/solution (human thought?)

Despite losing some information, machine and source code have interesting properties, such as being easy to copy, transpile to different formats, etc.

What I'd like to ask the HN brain is whether anyone can think of a way to express a higher-level thought other than language. In his essay, Naur implies that there is no such thing. I wonder if we have made any progress on that front in the 35 years that have elapsed since this essay was written.

The only thing I can think of is something like UML, which has tons of diagram types for structural and behavioral properties of a system, but I've always found it hard to 'see' the real idea they're trying to describe, in the same way that I find it hard to imagine a 4D object by looking at its 3D projections. With enough effort it's certainly doable, but I wouldn't say the process is intuitive or easy. So to me, diagrams are like projections of an idea from different points of view, but how do we encode the idea/thought/theory itself?

What is it about language and apprenticeship that makes conveying ideas or theories possible? I view this process as an inefficient way of serializing an idea and transmitting it over voice to another person, who has to deserialize the sounds, convert them to words, create the associations in their brain based on the meaning of those words, and then probe the correctness of those associations by asking clarifying questions.

Is this really the best we can do in 2020? How are other fields conveying complex abstract notions and ideas?

onlurking almost 5 years ago

OP here. Much of this paper revolves around the knowledge we need to build a system. A lot of it, like the understanding of business rules, is the context we need to build working software in the first place.

The problem is that this knowledge is mostly "tacit", and tends to grow as the software evolves. For example, many development tasks are completed based not only on the documented user stories but also on context from meetings or discussions that were never documented.

When you lose the original authors of the program, it becomes very difficult to rebuild the context needed to understand how the system works, and tasks like adding new features or modifying existing behavior become very hard. Also, in the "The Metaphor as a Theory" part, much of the work relies on knowledge shared between the developers: when you have several programmers working in parallel as fast as they can, the design of the program can become highly incoherent.

Nowadays we have practices like testing, which can be a really helpful companion for understanding how the system works and the expected behavior of its parts, and which can be treated as documentation. We also have code reviews, which, if done right, can ensure that any addition to the system is consistent with the system's design.

But still, this dependency on context is a very hard problem to solve.
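As a rough illustration of tests acting as documentation, here is a minimal sketch. Everything in it is hypothetical: the `shipping_cost` function, the threshold, and the fee are invented for this example and stand in for a business rule that might otherwise live only in someone's memory of a meeting.

```python
import unittest

# Hypothetical business rule, assumed for illustration only:
# orders at or above the threshold ship free, all others pay a flat fee.
FREE_SHIPPING_THRESHOLD = 100
FLAT_SHIPPING_FEE = 5


def shipping_cost(order_total):
    """Return the shipping fee for an order under the assumed rule."""
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return 0
    return FLAT_SHIPPING_FEE


class ShippingCostTest(unittest.TestCase):
    # The test names state the rule in plain words, so a newcomer can
    # read the expected behavior without knowing the original discussion.
    def test_orders_at_or_above_threshold_ship_free(self):
        self.assertEqual(shipping_cost(100), 0)

    def test_orders_below_threshold_pay_flat_fee(self):
        self.assertEqual(shipping_cost(99.99), FLAT_SHIPPING_FEE)
```

Saved as a module, this can be run with `python -m unittest`. The tests record what the rule is, not why it was chosen, so they narrow the context gap without closing it.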

at_a_remove almost 5 years ago

"If you lose the people, you lose the program."

I had a protracted, bitter struggle with this at one job. We had a business process, really in the top two of business processes, that was as neglected as it was critical. When I was new to that role, I said, "We should rewrite this. This is scattered across multiple servers, the code has almost no comments (in the places where we still have source code), and more importantly, people are leaving."

One of the original programmers had died. People responsible for the why of certain decisions were retiring or leaving. And so on and so forth. I used to joke that our process was documented in C, except for the places it was bash, or borrowed Powershell, or ... Like an evolved process, instead of problems being solved, later systems were added to correct issues, like epicycles, even as the bus factor continued to decrement every so often.

I still have a low level of sour antipathy when I think of it, that my efforts to "do the right thing" came to nothing.

It's a shame. When I had a freer hand to work as I liked, systems I built could detect, in a limited fashion, when the world had changed, that is, when the theory of the world was wrong. If the vendor changed some critical portion of the database, the program would identify the new column or the missing table, then loudly expire after issuing its complaint.

This understanding of the slice of the world a program must interact with is so critical and, worse yet, fragile, subject to both breakage and decay.

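The schema check described in the comment above (spot the new column or missing table, complain, then stop) could be sketched roughly as follows. This is not the commenter's code: it assumes SQLite via Python's standard `sqlite3` module, and the `orders` table, its columns, `EXPECTED_SCHEMA`, `SchemaDriftError`, and `check_schema` are all hypothetical names made up for the illustration.

```python
import sqlite3

# Hypothetical "theory of the world": the tables and columns this
# program assumes the vendor database contains.
EXPECTED_SCHEMA = {
    "orders": {"id", "customer_id", "total"},
}


class SchemaDriftError(RuntimeError):
    """Raised when the database no longer matches the program's assumptions."""


def check_schema(conn, expected):
    """Compare the live schema to the expected one and fail loudly on drift."""
    problems = []
    for table, expected_cols in expected.items():
        # PRAGMA table_info yields one row per column (the name is at index 1)
        # and no rows at all if the table does not exist. The table name comes
        # from a trusted constant, so formatting it into the SQL is acceptable.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        actual_cols = {row[1] for row in rows}
        if not actual_cols:
            problems.append(f"missing table: {table}")
            continue
        for col in sorted(expected_cols - actual_cols):
            problems.append(f"missing column: {table}.{col}")
        for col in sorted(actual_cols - expected_cols):
            problems.append(f"unexpected column: {table}.{col}")
    if problems:
        raise SchemaDriftError("; ".join(problems))
```

Calling `check_schema(conn, EXPECTED_SCHEMA)` at startup, before any real work, would make the program stop with a complaint rather than run on a stale assumption. Treating an unexpected column as fatal is a deliberately strict choice here; a real system might only warn in that case.
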
bonestormii almost 5 years ago

This concept evokes many familiar memories of reading other people's code, which is always extremely hard. I feel multiple ways about this concept. On the one hand, I believe programs can be sectionally reduced to inputs, outputs, and a sequence of states in between, and a programmer can understand those things well enough to extend an existing program competently in many cases. It must be true, because it does happen.

On the other hand, while a programmer can learn from source and documentation the wheres, whens, and whats of the program, there is always the remaining question of "Why?", which is central to this discussion. Here, I think good high-level examples of usage tend to do a good job of covering the inputs and outputs. But with regard to all of the intermediary states of the program... there is too much detail there to really document it. Those details come about through evolution more than design. There is code added, then replaced or omitted entirely. Things are designed which work, but then are restructured for performance, organization, or to eliminate repetition. In these cases, there is information that is manifested in the absence of code, and the second rendering of the code better captures its function, but obscures its evolutionary history.

Here's something I've been consuming lately: https://www.youtube.com/watch?v=wbpMiKiSKm8

This game programmer (Sebastian Lague, who is excellent, by the way) walks through the development of procedural terrain generation in Unity. What's fascinating to me is that the way he does it is a really effective job of "theory building". Things are implemented; results are observed; some code that was only ever present to allow building up to that illustration is deleted altogether, because it will no longer be necessary at the next stage of evolution.

This is the way programmers work. Information is lost. If you weren't there to experience it at inception, only a great imagination and testing can replace it--at which point, you may find yourself actually rewriting the code, using existing code as a reference.

alexashka almost 5 years ago

I think this could be applied to all spheres of human knowledge.

We have books written in languages we don't understand. There has to be shared context beneath all forms of communication, and communication itself can be seen as CRUD operations on the layered cake of shared contexts/stories/narratives/ideas that have dependencies between them.

A study of those CRUD operations, the interactions between the layers, what the various types are, and what common dependencies occur and when would be a very fruitful field of psychology, if not perhaps of some AI modelling. I don't know if there already is a psychological model along these lines.

For example, when you haven't seen a friend for a long time and you reconnect, if your life experiences haven't updated or deleted large parts of your shared context, you'll fall right back into the groove.

It does make me wonder if there needs to be an update of the shared context country-wide or even world-wide to make people feel a sense of community again. We've done away with religion and the many community bonding experiences that it offered. We've tried the 'get rich or die tryin' context and I don't think it has been fulfilling for the majority of the beta testers :)

Time to try something else, perhaps with a little more thought and rigor put into it?

fsloth almost 5 years ago

Peter Naur is one of the unsung(?) giants of software. His name should be instantly recognizable.

This is perhaps the best paper ever to explain the nature of collaborative program development and maintenance.

If I had a company I would make this mandatory reading for everybody - everybody, not just programmers.

azhu almost 5 years ago

> A program is a shared mental construct (he uses the word theory) that lives in the minds of the people who work on it.

Absolutely. The plain English definition for the word "program" that Google shows is (noun) "a set of related measures or activities with a particular long-term aim", (verb) "arrange according to a plan or schedule".

A software program fulfills a certain set of behaviors, serves a certain purpose, is a materialization of an idea, or otherwise is a transcription of something from a certain domain into the domain of software. The "source of truth" of what that something is lies outside the program.

The knowledge domain of a programmer is therefore not only their code and the idea it represents but also the mapping between the two. Both ends are easily documentable (in the narrow context of their specific domains), and it feels like this could have led to a possible convolution of what it takes to be a good programmer.

We have endless measures, philosophies, and codifications for what makes for good form when drawing back the bowstring, what release techniques make for the least disturbance to the arrow's path, what arrow shape makes for the most optimal flight, etc., but fewer for an archer's aiming technique. All we can do is look to see if the target's been hit or not.

How an org "aims" the "arrow" of code towards the business target is a higher-level concern than how awesome the arrow shot is or how straight it flies. If you're not controlling how you aim, you lack the context to fully qualify your assessment of how arrow choice, pull/release form, or even flight path affected your result.

Codifying not only how your org builds product ideas and how it implements software, but also how your org maps from one domain to the other, helps mitigate knowledge siloing.

Ididntdothis almost 5 years ago

This makes sense to me. I think it’s really important to be able to predict what the software will do under certain circumstances. You can do that only if you have a pretty good concept of the thinking behind the code. I usually get nervous when something doesn’t behave as expected because it indicates that there is a mismatch between the theory and its implementation.

This would explain why a lot of corporate software isn’t good. There is no shared understanding and a lot of people make changes without understanding the big picture.

imprettycool almost 5 years ago

I only read the abstract (first paragraph), so maybe I'm way off base here. I'm about to go to bed and want to bang this out:

I think it's three pieces that need to come together: the source, the system it's running on, and the user. You don't necessarily need all 3.

1. If you only have the user and the source but not the system then you're screwed. I could print out the entire FreeBSD source and docs, go back in time to 1820 and it would be pretty much useless since I need a C compiler and a million transistors, power supply and a bunch of other stuff. Obviously this is an extreme example, since most of the time you'd just have a slightly incomplete system (e.g. crappy build scripts but you know they built it on a unix system 2 years ago) so it's usually workable.

2. If it's the user and the system then that's basically proprietary software. You can reverse engineer the source. Tedious but doable.

3. If it's the source and the system, then you might be able to get a new user to study both and understand everything again. Depends on the complexity of the source/system and the docs.

I think of it as an organism, in that it can be damaged and heal itself. There is redundancy between these 3 axes. Depending on the circumstances, you can heal it or it might be permanently damaged.

igravious almost 5 years ago

Related discussions on HN:

10 months ago: https://news.ycombinator.com/item?id=20487652

2 years ago: https://news.ycombinator.com/item?id=10833278

5 years ago: https://news.ycombinator.com/item?id=7491661

akavel almost 5 years ago

The basic notion of program as theory fits into what I personally stumbled upon recently on my own. Notably, expanding on it, I like to see every execution of a program as an Experiment - in that it may support or invalidate the Theory (by manifesting bugs/undesired behaviors). I'm happy to see I'm not the first one to think of this idea. However, I am not necessarily convinced by the main claim the article seems to make based on it, that the Theory cannot be resurrected from the code of the program + documentation. I think it may be very hard, and depend a lot on many factors (quality of code, docs, the resurrecting team, their time, and as suggested, access to the domain where the program is used), but it may still be possible to a huge extent. I believe some sentences used by the author actually provide hints in support of this claim. Also, in other sciences like math or physics, albeit not easily, knowledge/theory transfer through writing can be done, or at least helped significantly.

ximm almost 5 years ago

> the primary aim of programming is to have the programmers build a theory

I don't agree with that statement, but I don't think the primary aim is to produce a program either.

I believe the primary aim is to enable users to use a program. For that they need a mental model. Maintaining a consistent and simple theory among developers is a means to that end.

p4bl0 almost 5 years ago

Very interesting article. Thanks for sharing. I think it explains very well why software developed by large IT companies¹ that put developer after developer on their clients' projects for a few months at a time is systematically very bad, to put it politely.

[1] I refer to what are called SSII (société de services en ingénierie informatique) in France.

mbrodersen almost 5 years ago

The primary aim of programming is to solve problems. The moment you forget that one simple fact you are already heading in the wrong direction.
