TechEcho

Strategies to quickly become productive in an unfamiliar codebase

126 points by sabarasaba over 10 years ago

19 comments

scott_s over 10 years ago
I'm surprised I didn't see something similar to what I do: the deep-dive.
I start with a relatively high level interface point, such as an important function in a public API. Such functions and methods tend to accomplish easily understandable things. And by "important" I mean something that is fundamental to what the system accomplishes.
Then you dive.
Your goal is to have a decent understanding of how this fundamental thing is accomplished. You start at the public facing function, then find the actual implementation of that function, and start reading code. If things make sense, you keep going. If you can't make sense of it, then you will probably need to start diving into related APIs and - most importantly - data structures.
This process will tend to have a point where you have dozens of files open, which have non-trivial relationships with each other, and they are a variety of interfaces and data structures. That's okay. You're just trying to get a feel for all of it; you're not necessarily going for total, complete understanding.
What you're going for is that *Aha!* moment where you can feel confident in saying, "Oh, *that's* how it's done." This will tend to happen once you find those fundamental data structures, and have finally pieced together some understanding of how they all fit together. Once you've had the *Aha!* moment, you can start to trace the results back out, to make sure that is how the thing is accomplished, or what is returned. I do this with all large codebases I encounter that I want to understand. It's quite fun to do this with the Linux source code.
My philosophy is that "It's all just code", which means that with enough patience, it's all understandable. Sometimes a good strategy is to just start diving into it.
shadowmint over 10 years ago
Am I the only one reading this thinking "as written by someone who seldom dives into foreign code bases"?
After a few months you'll be familiar with it?
Wow, I'm lucky to get 8 hours on a new code base before I have to ship a bugfix. Months?? O_o luxury. Sip your pina colada as you write your immediately outdated documentation. Great advice.
No, that code base was probably written by several people *some* of whom knew what they were doing, some of whom were just 'getting the job done' and some of whom were idiots.
Probably, the only useful suggestion in that list is code reviews. Hack a fix in, get someone who seems clued up to review and suggest a better way.
Look for logging, that's the first thing I do; you're probably not the first person to be given this code to work with, and if you're lucky the last folk(s) made some debugging tools. If not, be a pro and leave some good ones behind when you go.
...not documentation.
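As an illustration of the "leave some debugging tools behind" advice, here is a minimal Python sketch of a call-logging decorator. It is not from the comment: the logger name `legacy_debug` and the function `apply_discount` are hypothetical stand-ins, and a real code base would likely already have its own logging conventions to follow.

```python
import functools
import logging

# Hypothetical logger name; reuse the project's existing logger if it has one.
logger = logging.getLogger("legacy_debug")


def log_calls(func):
    """Log arguments, return value, and any exception for each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("-> %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the traceback, then let the error propagate unchanged.
            logger.exception("!! %s raised", func.__name__)
            raise
        logger.debug("<- %s returned %r", func.__name__, result)
        return result
    return wrapper


@log_calls
def apply_discount(price, percent):
    """Hypothetical stand-in for a function in the unfamiliar code base."""
    return price - price * percent / 100
```

Calling `logging.basicConfig(level=logging.DEBUG)` once at startup makes the entry and exit lines visible; the decorator can be removed again without touching the function body.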
goshx over 10 years ago
From my experience, the "be humble" strategy is by far the most important one. Many developers tend to dislike any piece of code that was not written by them before even looking into the actual code, or they get discouraged because they have a hard time learning how it works. For whoever is in this situation, be humble... pretend your way of writing software is not the only way and see if you can learn something. You may be surprised.
paperwork over 10 years ago
I once had to get my head around over 120k lines of complex, concurrent, buggy spaghetti Java code (for a real-time trading system).
My first attempt was to reverse engineer the code into a UML diagram. For some reason I keep making this mistake. A messy code base will result in an extremely messy diagram. It can give a few insights, but between finding a tool which will work and trying to make sense of a tangled mess of lines, visual diagrams usually aren't worth the time.
I found that a tool called Chronon was somewhat useful (google "DVR for Java"). This tool just records a single run of a program. It is great for going forwards AND BACKWARDS as you step through the code, take a look at different threads, state of various objects, etc.
My strategy was to run the server and have it execute a small and simple bit of functionality (execute a single order). Follow it all the way from input to output. Make the scenario a bit more complex and follow that through to completion. This way you get to understand the core functionality, edge case code and start to get a sense of performance enhancements, etc.
I found myself making steady progress and fixing a number of bugs, until I hit heisenbugs, caused by overly clever concurrency/object pooling. It is enough to drain your soul :)
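The "follow one small scenario from input to output" strategy can be roughly approximated in Python with the standard `sys.settrace` hook. This is only a sketch of the idea, not a substitute for a recording debugger such as the one mentioned above; `validate`, `route`, and `execute_order` are hypothetical stand-ins for a real order flow.

```python
import sys


def record_calls(func, *args, **kwargs):
    """Run func and return (result, names of Python functions entered, in order)."""
    calls = []

    def tracer(frame, event, arg):
        if event == "call":
            calls.append(frame.f_code.co_name)
        return None  # no per-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, calls


# Hypothetical stand-ins for the code path being explored.
def validate(order):
    return order["qty"] > 0


def route(order):
    return "FILLED" if validate(order) else "REJECTED"


def execute_order(order):
    return route(order)
```

Running `record_calls(execute_order, {"qty": 1})` yields the result together with the ordered list of functions entered, which gives a first map of the path before making the scenario more complex.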
metatation over 10 years ago
Another strategy that I've often used is to just fix a few bugs that seem to be in the vicinity of the area you want to familiarize yourself with. Heck, I even do this on my own codebases that I haven't touched in a while.
I find it helps focus the mind, provides a clear definition of success and forces you to think about a specific area of the code without requiring too much context.
asuffield over 10 years ago
I've done this a lot in my career. My strategy is simple: start by finding a trivial bug, and fix it. Then find a slightly less trivial bug. Do this four or five times. Don't ask anybody about anything unless I'm really lost, just chase the thread and let the bugs lead me through the active code paths.
After doing a few of these I've got a fair understanding of the structure of the code and can figure out where to go next.
inversion over 10 years ago
I find it helps to focus on a 'successful' path through the code, starting with an important entry-point and ignoring error conditions and validation. Once I've covered a fair chunk of what the software does, I can go back and delve into the parts I didn't understand.
I'm also wary of comments as I find they can often have 'drifted' in old code and become misleading. I make detailed notes of things that look dodgy or could be improved, separate from the code.
I'd add that if the project has poor-to-no build scripts that require a lot of manual steps, it's worth at least bandaging it up with a shell script early on.
I recently worked on a project that had many separate modules with independent build scripts, requiring copy-and-pasting built artifacts that others depended on, and several manual configuration tweaks post-packaging. That stuff is very tedious and sucks your energy. It's not generally a priority to rewrite all the build scripts, but if you're editing several modules it's worth being able to do a one-shot build from early on.
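The comment suggests a shell script; the same one-shot idea can be sketched in Python, where the standard `graphlib` module (Python 3.9+) works out an order in which modules can be built. The module names in `DEPS` and the `make -C {module}` command template are assumptions for illustration only, and the sketch produces a dry-run plan rather than running anything.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_order(deps):
    """Return modules ordered so each comes after the modules it depends on."""
    return list(TopologicalSorter(deps).static_order())


def build_plan(deps, command="make -C {module}"):
    """Render one command per module in dependency order (dry run only)."""
    return [command.format(module=m) for m in build_order(deps)]


# Hypothetical layout: app needs core and ui, ui needs core.
DEPS = {"app": {"core", "ui"}, "ui": {"core"}, "core": set()}
```

Printing `build_plan(DEPS)` shows the commands that would run; feeding each one to `subprocess.run(..., shell=True, check=True)` would turn the plan into an actual one-shot build.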
simmons over 10 years ago
There's a good episode of Software Engineering Radio on this topic, as well:
Episode 148: Software Archaeology with Dave Thomas http://www.se-radio.net/2009/11/episode-148-software-archaeology-with-dave-thomas/
caissy over 10 years ago
I think that the two most important strategies are to actually pair with someone and ask questions after doing some research.
Being able to do some pair programming helped me to understand a new code base, in a language I never used before, bit by bit. I could ask questions and actually helped on the issue. Asking questions is important, but as stated in the post, you should try to find the answer by yourself first. I'll actually time myself for 10 minutes, and if I can't find an answer, I'll just poke the most prominent person based on a git blame of the file that is related to my question.
This is my second week working on a Ruby codebase, without any prior experience with Ruby, or Rails (I've mostly been doing Python with Django and Pyramid for the past 3 years). I managed to get quickly up to speed this way.
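Finding "the most prominent person based on a git blame" can be automated. The following Python sketch assumes `git` is on the PATH and the file is tracked; it counts lines per author from `git blame --line-porcelain`, whose per-line headers include an `author <name>` entry. The helper names `count_authors` and `top_author` are made up for this example.

```python
import subprocess
from collections import Counter


def count_authors(porcelain_text):
    """Count blamed lines per author in `git blame --line-porcelain` output."""
    counts = Counter()
    for line in porcelain_text.split("\n"):
        # Header lines look like "author Alice"; file content lines start
        # with a tab, so they can never match this prefix.
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts


def top_author(path):
    """Return the author with the most blamed lines in path, or None."""
    out = subprocess.run(
        ["git", "blame", "--line-porcelain", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    ranked = count_authors(out).most_common(1)
    return ranked[0][0] if ranked else None
```

Line count is a crude proxy for who knows the file best; recent commit history may point to a better person to ask.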
karlb over 10 years ago
What types of tests, or testing software, could be run on a WordPress site? For example, after we update a plugin on a WordPress site we manually check that the site hasn't changed in appearance. What software, or methodology, would you recommend for automating this game of spot-the-difference?
The nearest I've found (but not tried) are:
Screenster: http://www.creamtec.com/products/screenster/index.html
Wraith: https://github.com/BBC-News/wraith
scottmwinters over 10 years ago
I recently started a new job with a new language and did a few of these steps. The only one that has really helped me learn this codebase is "make something".
Making unit tests can be great if you already understand the language (and the IDE), but when you're new to Xcode, don't expect a unit test to make any more sense than the code.
Pairing can be a great tool, if you are pairing with the right person. You can't always find the right programmer (with the time or care) to sit down and work with you like that.
I was certainly humbled by this process, whether by choice or not, and asked my share of dumb questions, but I don't think you can really understand a codebase until you work on it.
cratermoon over 10 years ago
Working Effectively with Legacy Code, by Michael Feathers, is still my go-to reference, even though it's now ten years old. https://cmdev.com/isbn/0131177052
jakozaur over 10 years ago
Nice 10k-foot overview, but more hands-on tips:
1. Run the program, look at stacktraces/profiler. Sometimes it's easier to analyze runtime and figure out what the program is doing and what the main/common patterns are.
E.g. Java - jstack, C++ - gdb, gprof
2. Use some static tools to get a call/caller graph and be able to browse the program quickly. Jump between definitions, see what's used, what's not.
E.g. C++ - docgen, Java - most modern IDE will do just fine
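To make the call/caller graph idea concrete, here is a small Python sketch that builds one for Python source using the standard `ast` module. It is an illustration rather than one of the tools named above, and it is deliberately limited: it only sees top-level functions and direct calls by plain name (no methods, no dynamic dispatch). The helper names `call_graph` and `callers_of` are made up for this example.

```python
import ast
from collections import defaultdict


def call_graph(source):
    """Map each top-level function name to the set of plain names it calls."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            graph[node.name]  # make sure functions with no calls still appear
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return dict(graph)


def callers_of(graph, target):
    """Return the sorted names of functions that call target directly."""
    return sorted(name for name, callees in graph.items() if target in callees)
```

Functions that never show up in any callee set are candidates for entry points or dead code, which covers the "see what's used, what's not" part of the tip.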
zorbo over 10 years ago
How can you write tests if you're unfamiliar with what the code is supposed to do?
wsc981 over 10 years ago
For my current employer I landed on the job during their regression test sprint.
Just running through the tests and communicating a lot with testers and developers helped me understand how the app (ought to) behave.
_RPM over 10 years ago
"Ask questions": I found that if you ask too many questions, management starts to get annoyed and actively avoids you and/or has attitude in their response.
LilBibby2342 over 10 years ago
Somewhat adjacent to the conversation, but Bowery - http://bowery.io/ - is building a toolset for getting a full dev environment in a few seconds.
Saw them on the "Made in NY" Product Hunt collection yesterday. Has anyone played with Bowery?
phpnode over 10 years ago
Can't overemphasise the importance of a good IDE for this kind of thing, even if you're a hardened vim or emacs zealot. A tool like IntelliJ IDEA makes navigating new/large codebases a million times easier.
EGreg over 10 years ago
This is very useful, and I think I will post or link to something like that on my own open source project's page.