It is an interesting observation. I expect it is the difference in expertise, though. The author compares reading code snippets in Dr. Dobb's (which dates them, probably 10 or 15 years ago) to reading stories.<p>Here is the thing: everything you read in a story is supposed to convey imagery of things you may have already experienced. They are already internalized; you "see" them when you read, as if you were there.<p>Allow me to use another popular space as an example: music. When you are first reading sheet music, you see notes on a stave, key signatures, different shapes representing different durations. At first you mechanically take that understanding and laboriously turn it into actions on your instrument. But after a while, if you do it enough, the shapes become recognizable as rhythms, the marks on the staves become tones rather than symbols, and then you stop "reading" music: you look at it and you can hear what it will sound like. And by that time you can make your instrument play whatever you hear.<p>Coding is not entirely different. At some point you don't see syntax; you see algorithm, you see inter-relationships of data structures, you see flow. After a number of years of coding I got to the point where I could see what code was doing pretty easily (except for obfuscated code, which is always jarring on first look). I stop seeing loops and start seeing iterative processing; if statements are branches on a path.<p>Anything in words or symbols is code for something else. Whether it's a murder mystery, a symphony, or a sorting algorithm, the words and symbols are there to express an idea inside someone's head so that you can understand it. I think it is all reading, though :-)
I've pretty much come to the conclusion that you don't understand code by reading it. Understanding someone else's code is almost always a reverse engineering exercise. It's often necessary to actually run the code repeatedly to understand it. This should really come as no surprise. It's likely that the guy that wrote the code didn't write it all at once, but rather wrote it in increments testing along the way. If he couldn't write it without testing every little bit, it's unlikely that you can read it without doing the same.
And this is one of the main reasons I massively prefer type annotations, statically analyzable languages and IDEs over dynamically typed languages, very loose languages and plain text editors.<p>Not being able to see what types code deals in, jump to definition, and find usages makes me feel crippled when exploring a new codebase.<p>One of my wishes for programmers everywhere is that tools like GitHub and Bitbucket start analyzing projects and letting you navigate better. I think that could save thousands of engineer-hours.<p>This topic alone is a huge reason I started using Dart and eventually joined the team. Trying to figure out how a very large JavaScript codebase works is incredibly painful; doing so in Dart is incredibly easy. This is also where very reflective libraries like Guice go wrong, and why, in my opinion, meta-programming should be used very carefully and sparingly.
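The "seeing what types code deals in" point is easy to demonstrate even in a language where annotations are optional. A minimal sketch in Python (all names here are hypothetical): with annotations, a reader or a tool knows exactly which structure a parameter is and can jump straight to its definition; without them, you have to trace call sites to find out.

```python
from dataclasses import dataclass

@dataclass
class Invoice:
    total_cents: int

# Unannotated: what is `inv`? You must hunt down a caller to know.
def apply_discount_untyped(inv, pct):
    inv.total_cents -= inv.total_cents * pct // 100
    return inv

# Annotated: the signature names the structures involved, so an IDE
# (or a careful reader) can jump directly to their definitions.
def apply_discount(inv: Invoice, pct: int) -> Invoice:
    inv.total_cents -= inv.total_cents * pct // 100
    return inv

print(apply_discount(Invoice(total_cents=1000), 10).total_cents)  # → 900
```

Both functions do the same thing; only the second one tells you so without leaving the file.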
One of the ideas I've been kicking around for the last year or two is to write a book on learning to read code properly. I would submit that one of the greatest weaknesses of your typical programmer is that they don't know how to read other people's code.<p>On another note, being able to read other people's code is one of the strengths of Haskell. The Functor/Monad/Monoid/etc. stuff becomes a way to know, based on a common vernacular, exactly what kind of interface is being exposed and what sort of data structures you're working with.
I've read a lot of code (both source and otherwise) and found that in many cases one of the biggest barriers to understanding the system as a whole is actually abstraction/indirection -- or more precisely, the often-excessive use of it. Following execution across multi-level-deep call chains that span many different files in different subdirectories feels almost like obfuscation, and all the pieces required to understand what happens in a particular case (important when e.g. looking for a bug) are scattered thinly over the whole system, so that it takes significant effort to collect them all together.<p>In my experience, the majority of existing codebases I've worked with tend to be this way, although there are exceptions where everything is so simple and straightforwardly written that reading them is almost an enlightening experience.
Program comprehension seems like a fascinating research field if you enjoy figuring out why we do the things we do as software developers. It seems that we do indeed explore code, but we do so in somewhat systematic and predictable ways, which in turn might give us ideas about how to <i>write</i> our code to be more readable.<p>If anyone is interested, I suggest the publications of Anneliese von Mayrhauser and A. Marie Vans from the mid-90s as a possible starting point. They did a lot of work to reconcile earlier theories and paint a more unified big picture. Václav Rajlich is another name to search for, with several interesting publications in the early 2000s.
This is related to something I pondered yesterday.
In mainstream languages, imperative programs require you to read the whole thing, while functional programs allow you to look at a function and derive its meaning from its implementation alone: there can't be another influence on the outcome, as there are no side effects and types are -- in ML/Haskell family members -- precise. So referential transparency, in a sense, helps you read code depth-first, looking only at the bits relevant right now.<p>As soon as you branch off from basic type systems and add in, for instance, subtyping or type classes or existential quantification, this ability weakens. Now, in order to understand what a piece of code does, you need to understand some context: 'which implementation is it?', 'what do the possible implementations have in common?'. The effects of these small complexities add up until there's too much information to be kept in biological memory at the same time, and the oldest thunk of information gets purged.<p>I do think that typed and pure languages have big advantages here: the information I need is available immediately from looking at the type of the reference. If the types aren't funky, I can assume that it will terminate, throw no exception, and not suffer from data races -- I can think exclusively about <i>what</i> it does, not how (btw, I don't have one specific reference language in mind right now; I'm just thinking about what would be possible).
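The depth-first-reading point can be illustrated even in an impure language. A sketch in Python with hypothetical names: the pure function can be understood from its body alone, while the impure one forces you to find every piece of code that writes the surrounding state.

```python
tax_rate_pct = 20  # module-level state, possibly mutated elsewhere

# Impure: the result depends on `tax_rate_pct`, so understanding this
# function means understanding every assignment to that variable,
# anywhere in the program.
def total_with_tax(amount_cents: int) -> int:
    return amount_cents + amount_cents * tax_rate_pct // 100

# Pure: everything that influences the result appears in the
# signature, so the function can be read depth-first, in isolation.
def total_with_tax_pure(amount_cents: int, rate_pct: int) -> int:
    return amount_cents + amount_cents * rate_pct // 100

print(total_with_tax_pure(1000, 20))  # → 1200
```

In Haskell the type system enforces this distinction; here it is only a convention, which is exactly the commenter's point about needing context.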
Here's a challenge for somebody: create a code review/pull request tool that helps you really <i>understand</i> the code changes. Some IDEs do an ok job of this for static code (IntelliJ for Java, Emacs haskell-mode), but I've never seen a good tool for giving insight into how a diff is changing a program at the structural level.
Part of my difficulty reading large code bases is there's often no particularly good entry point to start from. Towards the end of my time in OO languages I was struck by Reenskaug and Coplien's DCI [1] - many people just don't model a system's use cases as first class concepts. The most important part of a system - what it actually does! - doesn't get explicitly mentioned in a lot of people's code.<p>[1] <a href="http://en.wikipedia.org/wiki/Data,_context_and_interaction" rel="nofollow">http://en.wikipedia.org/wiki/Data,_context_and_interaction</a>
I rewrite code to explore and understand it. This actually started as a bad habit - "Oh, I can't believe this person did X, I'm going to change that to Y". Do that enough and you quickly figure out exactly <i>why</i> they did X, and they were either smarter than you or much more familiar with the problem domain. But now you know a little more, you have a little more respect for the code, a little more humility, etc..<p>Nowadays I take it as a given that I'm probably wrong, but I start rewriting anyway. Worst case (and most common case) I have to toss the code. But I learn. Plus there's a different place your brain goes when you feel like you control the code vs. looking at it behind glass.
I'm baffled by the absence of tools for diagram generation from code, and of things like contextual highlighting in, well, every code editor. Even code folding seems pretty primitive. Bret Victor and the Light Table team are proposing some significant innovations, but most IDEs make me feel like I'm trying to explore a room through a keyhole.
I find it surprising that in 2014, most of the tools we use to read code treat it mostly like any other text. With the exception of some IDEs for some languages (e.g., Eclipse and Java), very few apps for browsing code actually understand its structure, i.e., the hierarchy of symbols and namespaces and the implicit graph defined by module imports, function calls, type references, etc. We're relying more and more on external, often open-source libraries, and are therefore spending more and more of our time reading through other people's code. Yet the tools don't seem to have caught up.
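A hint of what such structure-aware tools could do is already available from standard parsers. A minimal sketch in Python using the stdlib `ast` module to extract a piece of the "implicit graph" the comment mentions, namely which functions call which (attribute calls like `os.getcwd()` are skipped here for brevity):

```python
import ast

source = """
import os

def helper():
    return os.getcwd()

def main():
    return helper()
"""

# Walk the syntax tree and record, for each function, which plain
# names it calls: a tiny call graph a code browser could expose.
tree = ast.parse(source)
calls = {}
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        calls[node.name] = [
            n.func.id
            for n in ast.walk(node)
            if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
        ]

print(calls)  # → {'helper': [], 'main': ['helper']}
```

A real code browser would resolve attribute calls and imports too, but even this much is more structure than a plain-text diff viewer offers.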
My technique for reading code is to find out where execution starts and go from there, following what the code does at runtime.<p>If you do it any other way, it won't necessarily make sense. This is really the only way to do it. (Though I'd be interested in hearing other perspectives.)<p>This was a hard-won lesson for me because we programmers tend to make the control flow of our programs start at the <i>bottom</i> of source files.
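The bottom-of-the-file point can be sketched with a hypothetical script: the place where execution starts, and therefore the place to start reading, sits at the very end, after all the helpers it calls.

```python
# Helpers come first, so a top-to-bottom read meets them
# before you know why they exist.
def load_config() -> dict:
    return {"greeting": "hello"}

def run(config: dict) -> str:
    return config["greeting"].upper()

# Execution starts here, at the *bottom* of the file, which is why
# "find where execution starts" beats "read from the top".
if __name__ == "__main__":
    print(run(load_config()))  # prints "HELLO"
```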
There are many things that you read that are neither enjoyable, nor easy to understand. Especially at a cursory read. That doesn't make the word less appropriate, nor does it make the word explore any more appropriate. I don't explore a quantum physics textbook, nor do I explore a journal article on tubulins. I read, I jot down notes, and I read some more.
I like Tim Daly's viewpoint that we shouldn't just be writing <i>code</i>, but should write <i>manuals</i> - giving high-level overviews of our problems and specifying their implementation details inside the manuals. His talk "Literate Programming in the Large"[1] covers why.<p>The example he gives in his talk is the Axiom algebra system[2], which was revised to use literate programming style - the source code, with usage examples, is contained entirely within the books.<p>[1]:<a href="https://www.youtube.com/watch?v=Av0PQDVTP4A" rel="nofollow">https://www.youtube.com/watch?v=Av0PQDVTP4A</a>, [slides]: <a href="http://daly.axiom-developer.org/TimothyDaly_files/publications/DocConf/LiterateSoftwareTalk.pdf" rel="nofollow">http://daly.axiom-developer.org/TimothyDaly_files/publicatio...</a><p>[2]:<a href="https://en.wikipedia.org/wiki/Axiom_%28computer_algebra_system%29" rel="nofollow">https://en.wikipedia.org/wiki/Axiom_%28computer_algebra_syst...</a>
I wrote this a couple of years ago to help read and explore code:<p><a href="http://sherlockcode.com/demos/jquery/" rel="nofollow">http://sherlockcode.com/demos/jquery/</a><p>It's recently seen a spike of interest and I've started working on it again. The beta sign up link is still active if you are interested in getting updates.
Yes, yes, yes! I agree wholeheartedly. Exploratory programming is my new favourite weapon for learning about new codebases, new languages, new everything.<p>I just started a new job at a really interesting agency. I got put on a 12-month-old project, a huge web application, that started life overseas, moved back here to Australia, and according to git-blame has since moved through the hands of nearly 15 developers, a solid 70% of whom don't work here anymore (most were contractors).<p>So, the codebase is a mess. But with Xdebug, and a neat client for it that gives you an interactive console when you hit a breakpoint, two weeks later I'm already understanding the twists and turns far better than I ever hoped to!
I agree, I think we need better tools for exploring code. [1]<p>I'm currently envisioning (and trying to build) something where the types/func definitions are hyperlinks, and they jump to definition in an overlaying window similar to when you navigate in Spotify (the web based player). So you can quickly explore something without losing context.<p>[1] <a href="https://twitter.com/shurcooL/status/156526541214457856" rel="nofollow">https://twitter.com/shurcooL/status/156526541214457856</a>
For someone who has been in the field more than thirty years, this author speaks too easily about some important figures. I stop 'reading' or 'exploring' and start writing: all the CS world is a stage, and MS and Int merely players. Regardless, I downvoted the article.
The way to explore a codebase really depends on how much time and effort one can pour into it:<p>- Grep it<p>- Debug it<p>- Read over it<p>- Rewrite it