The essay "Cracking the Oyster" from Programming Pearls is one of the early recorded instances of the XY problem. The "looking to solve the problem I think I have rather than the one that I actually have" goes back to some of the early days of programming.<p>( The following portion can be found at <a href="https://tfetimes.com/wp-content/uploads/2015/04/ProgrammingPearls2nd.pdf" rel="nofollow">https://tfetimes.com/wp-content/uploads/2015/04/ProgrammingP...</a> and in a less accessible format at <a href="https://archive.org/details/ProgrammingPearls2ndEditionJonBentley/page/n233/mode/2up" rel="nofollow">https://archive.org/details/ProgrammingPearls2ndEditionJonBe...</a> )<p>> Column 1: Cracking the Oyster The programmer’s question was simple: “How do I sort a disk file?” Before I tell you how I made my first mistake, let me give you a chance to do better than I did. What would you have said?<p>> 1.1 A Friendly Conversation<p>> My mistake was to answer his question. I gave him a thumbnail sketch of how to implement a Merge Sort on disk. My suggestion that he dig into an algorithms text met with less than enthusiasm — he was more concerned about solving the problem than furthering his education. I then told him about a disk sorting program in a popular programming book. The program consisted of about two hundred lines of code in a dozen functions; I estimated that implementing and testing the code would have taken the programmer at most a week.<p>> I thought that I had solved his problem, but his hesitation led me back to the right track. The conversation then went something like this, with my questions in italics.<p>> <i>Why do you want to write your own sort at all? Why not use a sort provided by your system?</i><p>> I need the sort in the middle of a large system, and for obscure technical reasons, I can’t use the system file-sorting program.<p>> <i>What exactly are you sorting? How many records are in the file? What is the format of each record?</i><p>> The file contains at most ten million records; each record is a seven-digit integer.<p>> <i>Wait a minute. If the file is that small, why bother going to disk at all? Why not just sort it in main memory?</i><p>> Although the machine has many megabytes of main memory, this function is part of a big system. I expect that I’ll have only about a megabyte free at that point.<p>> <i>Is there anything else you can tell me about the records?</i><p>> Each one is a seven-digit positive integer with no other associated data, and no integer can appear more than once.<p>> The context makes the problem clearer. In the United States, telephone numbers consist of a three-digit “area code” followed by seven additional digits. Telephone calls to numbers with the “toll-free” area code of 800 (the only such code at the time) were not charged.<p>> The programmer was building a small corner of a system for processing such a database, and the integers to be sorted were toll-free telephone numbers. The input file was a list of numbers (with all other information removed), and it was an error to include the same number twice. The desired output was a file of the numbers, sorted in increasing numeric order. The context also defines the performance requirements. During a long session with the system, the user requested a sorted file roughly once an hour and could do nothing until the sort was completed. The sort therefore couldn’t take more than a few minutes, while ten seconds was a more desirable run time.<p>> 1.2 Precise Problem Statement<p>> To the programmer these requirements added up to, “How do I sort a disk file?” Before we attack the problem, let’s arrange what we know in a less biased and more useful form.<p>> ...<p>> 1.5 Principles<p>> The programmer told me about his problem in a phone call; it took us about fifteen minutes to get to the real problem and find the bitmap solution. It took him a couple of hours to implement the program in a few dozen lines of code, which was far superior to the hundreds of lines of code and the week of programming time that we had feared at the start of the phone call. And the program was lightning fast: while a Merge Sort on disk might have taken many minutes, this program took little more than the time to read the input and to write the output — about ten seconds. Solution 3 contains timing details on several programs for the task.<p>> Those facts contain the first lesson from this case study: careful analysis of a small problem can sometimes yield tremendous practical benefits. In this case a few minutes of careful study led to an order of magnitude reduction in code length, programmer time and run time. General Chuck Yeager (the first person to fly faster than sound) praised an airplane’s engine system with the words “simple, few parts, easy to maintain, very strong”; this program shares those attributes. The program’s specialized structure, however, would be hard to modify if certain dimensions of the specifications were changed. In addition to the advertising for clever programming, this case illustrates the following general principles.<p>> <i>The Right Problem.</i> Defining the problem was about ninety percent of this battle — I’m glad that the programmer didn’t settle for the first program I described. Problems 10, 11 and 12 have elegant solutions once you pose the right problem; think hard about them before looking at the hints and solutions.<p>---<p>I highly recommend the book as it makes you think which was especially important in the days of the limited resources of early computing. And while we are spoiled now with powerful processors and plentiful memory we shouldn't let that make us lazy in our designs.