As I write this column, I'm in the middle of two summer projects; with
luck, they'll both be finished by the time you read it. One involves a
forensic analysis of over 100,000 lines of old C and assembly code
from about 1990, and I have to work on Windows XP. The other is a hack
to translate code written in weird language L1 into weird language L2
with a program written in scripting language L3, where none of the L's
even existed in 1990; this one uses Linux. Thus it's perhaps a bit
surprising that I find myself relying on much the same toolset for
these very different tasks.

What's Changed

Bill Plauger and I wrote Software Tools in 1975, nine years before
IEEE Software began publication. Our title was certainly not the first
use of the phrase, but the book did help to popularize the idea of
tools and show how a small set of simple text-based tools could make
programmers more productive. Our toolset was stolen quite explicitly
from Unix models. At the time, Unix was barely known outside a tiny
community, and even the idea of thinking consciously about software
tools seemed new. In fact, we even wrote our programs in a dialect of
Fortran because C was barely three years old at the time and we
thought we'd sell more copies if we aimed at Fortran programmers. A
lot has changed over the past 25 or 30 years.

Computers are enormously faster and have vastly more memory and disk
space, and we can write much bigger and more interesting programs with
all that horsepower. Although C is still widely used, programmers
today often prefer languages such as Java and Python that spend memory
and time to gain expressiveness and safety, which is almost always a
good trade. We develop code differently as well, with powerful
integrated development environments (IDEs) such as Visual Studio and
Eclipse--complex tools that manage the whole process, showing us all
the facets of the code and replacing manuals with online help and
syntax completion. Sophisticated frameworks generate boatloads of code
for us and glue it all together at the click of a mouse. In principle,
we're far better off than we used to be. But when I program, the tools
that I use most often, or that I miss the most when they aren't
available, are not the fancy IDEs. They're the old stalwarts from the
Software Tools and early Unix era, such as grep, diff, sort, wc, and
shells. For example, my forensics work requires comparing two versions
of the program. How better to compare them than with diff? There are
many hundreds of files, so I use find to walk the directory hierarchy
and generate lists of files to work with. I want to repeat some
sequence of operations--time for a shell script. And of course there's
endless grepping to find all the places where some variable is defined
or used. The combination of grep and sort brings together things that
should be the same but might not be--for instance, a variable that's
declared differently in two files, or all the potentially risky
#defines. The language translation project uses much the same core
set: diff to compare program outputs, grep to find things, the shell
to automate regression testing.
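To make that concrete, here is the flavor of the throwaway scripting involved. It is only a sketch: the directory names, the file patterns, the variable bufsize, and the little translate harness are invented for illustration, not taken from either project.

    # List the source files in each version and see what appeared or vanished.
    find old-src -name '*.[ch]' | sort >old.list
    find new-src -name '*.[ch]' | sort >new.list
    diff old.list new.list

    # Endless grepping: every place a suspect variable shows up, with line numbers.
    grep -n 'bufsize' old-src/*.[ch]

    # Grep and sort together: pull out all the #defines so that duplicates
    # and near-duplicates land next to each other.
    grep -h '#define' old-src/*.h new-src/*.h | sort | uniq -c

    # A tiny regression loop for the translator: run each test, diff the output.
    for t in tests/*.in; do
        ./translate <"$t" >out.tmp
        diff "${t%.in}.expected" out.tmp >/dev/null || echo "FAIL: $t"
    done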

What We Want

What do we want from our tools? First and foremost is mechanical
advantage: the tool must do some task better than people can,
augmenting or replacing our own effort. Grep, which finds patterns of
text, is the quintessential example of a good tool: it's dead easy to
use, and it searches faster and better than we can. Grep is actually
an improvement on many of its successors. I've never figured out how
to get Visual Studio or Eclipse to produce a compact list of all the
places where a particular string occurs throughout a program. I'm sure
experts will be happy to teach me, but that's not much help when the
experts are far away or the IDE isn't installed. That leads to the
second criterion for a good tool: it should be available
everywhere. It's no help if SuperWhatever for Windows offers some
wonderful feature but I'm working on Unix. The other direction is
better because Unix command-line tools are readily available
everywhere. One of the first things I do on a new Windows machine is
install Cygwin so that I can get some work done. The universality of
the old faithfuls makes them more useful than more powerful systems
that are tied to a specific environment or that are so big and
complicated that it just takes too long to get started. The third
criterion for good tools is that they can be used in unexpected ways,
the way we use a screwdriver to pry open a paint can and a hammer to
close it up again. One of the most compelling advantages of the old
Unix collection is that each one does some generic but focused task
(searching, sorting, counting, comparing) but can be endlessly
combined with others to perform complicated ad hoc operations. The
early Unix literature is full of examples of novel shell programs. Of
course, the shell itself is a great example of a generic but focused
tool: it concentrates on running programs and encapsulating frequent
operations in scripts. It's hard to mix and match programs unless they
share some uniform representation of information. In the good old
days, that was plain ASCII text, not proprietary binary
formats. Naturally, there are also tools to convert nontext
representations into text. For my forensics work, one of the most
useful is strings, which finds the ASCII text within a binary
file. The combination of strings and grep often gives real insight
into the contents of some otherwise-inscrutable file, and, if all else
fails, od produces a readable view of the raw bits that can even be
grepped.
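In practice that might look something like the following; the file name and the patterns are invented, and the last line just shows that od's output is itself greppable.

    # Fish the printable text out of a mystery binary and grep it for clues.
    strings mystery.bin | grep -i 'copyright'
    strings mystery.bin | sort | uniq -c | sort -rn | head

    # When that fails, od turns the raw bytes into text.
    od -c mystery.bin | head
    od -c mystery.bin | grep -c '\\0'    # how many dump lines contain NUL bytes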

A fourth criterion for a good tool is that it not be too
specialized--put another way, that it not know too much. IDEs know
that you're writing a program in a specific language, so they won't
help if you're not; indeed, it can be a real chore to force some
nonstandard component into one, like a Yacc grammar as part of a C
program. Lest it seem like I'm only complaining about big environments
here, even old tools can be messed up. Consider wc, which counts
lines, words, and characters. It does a fine job on vanilla text, and
it's valuable for a quick assessment of any arbitrary file (an
unplanned-for use). But the Linux version of wc has been "improved":
by default it thinks it's really counting words in Unicode. So if the
input is a nontext file, Linux wc complains about every byte that
looks like a broken Unicode character, and it runs like a turtle as a
result. A great tool has been damaged because it thinks it knows what
you're doing. You can remedy that behavior with the right incantation,
but only if a wizard is nearby.
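(On most Linux systems the incantation amounts to forcing the traditional byte-oriented C locale, as in the sketch below; the file name is invented, and the exact behavior depends on which wc is installed.)

    # Under a UTF-8 locale, wc may grumble about bytes that aren't valid Unicode.
    wc mystery.bin
    # Forcing the C locale makes it count plain bytes and lines, the old way.
    LC_ALL=C wc mystery.bin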

There has surely been much progress in tools over the 25 years that
IEEE Software has been around, and I wouldn't want to go back in
time. But the tools I use today are mostly the same old ones--grep,
diff, sort, awk, and friends. This might well mean that I'm a dinosaur
stuck in the past. On the other hand, when it comes to doing simple
things quickly, I can often have the job done while experts are still
waiting for their IDE to start up. Sometimes the old ways are best,
and they're certainly worth knowing well.