Ask HN: How do you get started on a code base?

12 points by hgarg over 2 years ago

What are some tricks and methods you use when you are getting yourself familiar with a new code base? How do you shorten the learning curve?

8 comments

Leftium over 2 years ago

Start with the data:

"Show me your [code] and conceal your [data structures], and I shall continue to be mystified. Show me your [data structures], and I won't usually need your [code]; it'll be obvious."

https://hw.leftium.com/#/item/10293795

---

Also, I find that opening a project in VS Code makes browsing and searching through it much easier, with features like:

- F12 to jump to definition/references
- Ctrl+Shift+F (Cmd+Shift+F on macOS) for global project search

binarymax over 2 years ago

In my experience, there are no shortcuts. The approach is always the same: fix some bugs, write some test coverage, and add some small features. Depending on the size of the project, it might be quick or it might take a long time.

One problem you might face if the codebase is really old is the multitude of styles and patterns that exist in different areas of the project. For those, it's good to start in one area and slowly reach out to other areas over time.

--EDIT--

Thinking about this some more, here are some other tips. First and foremost, remember Chesterton's Fence: if you see something that looks strange and unnecessary, don't remove it; it might be load-bearing! Also, if you are working in an IDE that has a step-through debugger (common in C#, Java, and others), set an early breakpoint for an API call and follow it all the way down the rabbit hole. You'll see things happening that you never would have found by manually poking around the source.
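One way to combine "write some test coverage" with Chesterton's Fence is a characterization test: pin down what the code does today before touching it. A minimal sketch, assuming a made-up `legacy_shipping_cost` function (hypothetical, not from any real project) with a surcharge that looks arbitrary:

```python
def legacy_shipping_cost(weight_kg, express=False):
    """Hypothetical stand-in for old code you do not fully understand yet."""
    cost = 5.0 + 1.5 * weight_kg
    if weight_kg > 20:
        cost += 7.5  # looks strange and unnecessary; might be load-bearing
    if express:
        cost *= 2
    return cost


# Characterization tests: they record current behavior, not desired behavior.
def test_light_parcel():
    assert legacy_shipping_cost(1) == 6.5


def test_heavy_parcel_keeps_surcharge():
    assert legacy_shipping_cost(25) == 50.0


def test_express_doubles_the_total():
    assert legacy_shipping_cost(25, express=True) == 100.0
```

If a later cleanup removes the surcharge, the second test fails and forces the question of why it was there.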

dyingkneepad over 2 years ago

A lot of times you simply don't understand what the program is even trying to implement, so you fail to understand the codebase. You have to understand what is being implemented: the inputs and outputs of the codebase. For example, if you want to tackle a graphics driver, you should probably learn the API that it's implementing (OpenGL, Vulkan, DirectX 12, etc.), which is the input to the system, and you should also learn the OS and hardware interfaces, which are the output of the system.

Every time you need to learn a subsystem of the codebase, the same concept applies. Learning the shading language compiler? Well, learn the shading language itself and then learn the hardware's programming interface for shaders. Learning memory management? Learn how programs manage their memory, then learn how the hardware manages memory. And so on.

Want to dive into a library? Well, learn how to use the library. Then learn whatever interface the library sits on top of. Then imagine how you would implement the library yourself. That should give you a very good idea of how the library works, and you will probably start recognizing what each piece of the code does.

Also, ctags, cscope and Vim's Ctrl+] are essential.
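To show what a tag index buys you, here is a toy analog of ctags for Python sources, built on the standard `ast` module. It is only a sketch of the idea (name to definition site), not how ctags or cscope actually work, and `index_definitions` is a name made up for this example:

```python
import ast
from pathlib import Path


def index_definitions(root):
    """Map each function/class name under root to [(file, line), ...]."""
    index = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                index.setdefault(node.name, []).append((str(path), node.lineno))
    return index
```

Calling `index_definitions(".")["some_name"]` then answers "where is this defined?", which is the same question Ctrl+] answers in Vim once a tags file exists.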

karmakaze over 2 years ago

The data. I study the data. The database model. The cardinalities, unique keys. Dataflow. If you understand a good chunk of that, other things can fit in around it.

As for understanding code: do deep dives, fix bugs, try to write a feature. After doing a number of these you start to get a picture of some details and characteristics of the programming style, and you start making connections at higher levels.
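For a quick first look at the database model, a small script can dump tables, columns, unique keys, foreign keys and row counts. A minimal sketch, assuming the data lives in SQLite (other databases expose the same information through their own catalogs); `describe_schema` is a name made up for this example:

```python
import sqlite3


def describe_schema(conn):
    """Return {table: {"columns", "unique", "foreign_keys", "rows"}} for a SQLite db."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")').fetchall()]
        unique = []
        # index_list rows: (seq, name, unique, origin, partial)
        for idx in conn.execute(f'PRAGMA index_list("{table}")').fetchall():
            if idx[2]:
                # index_info rows: (seqno, cid, name)
                info = conn.execute(f'PRAGMA index_info("{idx[1]}")').fetchall()
                unique.append([r[2] for r in info])
        # foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = [
            (r[3], r[2], r[4])
            for r in conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        ]
        rows = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        schema[table] = {"columns": columns, "unique": unique, "foreign_keys": fks, "rows": rows}
    return schema
```

The foreign keys give the direction of each relationship, and comparing row counts across related tables gives a rough feel for the cardinalities.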

electrondood over 2 years ago

I ask a few focusing questions:

- "What does the business do, what do the users care about, and what value does this system add for those users?"
- "What 'things' does this system involve?" Print out the DB schema + classes to see all of the entities represented in the system.
- "What are the things this system does?" Read the unit tests. They describe the intended behavior + the minimal context required for that behavior.
- "How do I see output for the changes I make when I work?" Ask a few of the other devs what their feedback loop looks like during development.

Then I make a copy of the repo and comment the shit out of it for myself. I summarize the existing code, and include any observations, thoughts, whatever.

One more: I keep a text file with a list of "activating questions" prior to any of this, which I continually add to and write answers in. I use this to batch inquiries for other devs, with whom I set up 20-minute sessions to answer my questions, so I don't pepper them continually with interruptions.

Jtsummers over 2 years ago

Dissect and vivisect it.

I turn a new-to-me codebase into a literate program using org-mode, creating all the links I need and documenting as I go (since most systems are grossly underdocumented). I also write tests and use a debugger/tracer/whatever to follow the executable's flow. I have git ignore the .org files and periodically tangle the source files to ensure I haven't introduced unintended changes (as seen using git diff). The org files are for me, and I turn them into documentation for other people if the whole team is new to a system (like when it transfers from one org to another for whatever reason).

I don't think there's really any way to shorten the curve, but this has the effect of focusing my efforts at understanding. Previously I'd just poke about haphazardly; it turns out that's too unfocused to be effective for me. It leads to a narrow understanding of the system instead of the desired broad understanding of it.

atomicnature over 2 years ago

Start with tests, use the debugger to explore different "execution paths", try to have a small goal (minor modification), and work towards that.
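When a step-through debugger isn't handy, a few lines around `sys.settrace` can record which functions one execution path touches. A rough sketch for Python code only; `trace_calls` and the three small functions are hypothetical names for illustration:

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return (result, names of Python functions entered, in order)."""
    calls = []

    def tracer(frame, event, arg):
        if event == "call":
            calls.append(frame.f_code.co_name)
        return None  # no line-by-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, calls


# Hypothetical code under study.
def parse(text):
    return text.split(",")


def validate(items):
    kept = []
    for item in items:
        if item.strip():
            kept.append(item.strip())
    return kept


def handle_request(text):
    return validate(parse(text))
```

`trace_calls(handle_request, "a, ,b")` returns the cleaned list together with the order in which `handle_request`, `parse` and `validate` were entered; calls into C-implemented builtins such as `str.split` are not reported by this hook.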

thedevindevops over 2 years ago

Start with the interfaces; the subsystem boundaries will tell you more about the system than most other parts. I usually pick a few of the biggest ones, or ones related to the 'easy' task that I've been assigned as the new guy. Give each one a page in your notebook and start making notes. Go to their implementations and note what other interfaces they communicate with, then check the tests for those interfaces and how they're mocked. Make special notes when the implementations interact with external systems (other servers, web APIs, databases, etc.). Keep going gradually until you've mapped enough of the system that you feel comfortable.
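As an illustration of why the mocks are worth reading, here is a small sketch with a made-up `PaymentGateway` interface and `Checkout` class (both hypothetical). The test shows exactly which call crosses the boundary to the external system and with what arguments:

```python
from typing import Protocol
from unittest.mock import Mock


class PaymentGateway(Protocol):
    """Hypothetical boundary to an external payment service."""

    def charge(self, customer_id: str, cents: int) -> str: ...


class Checkout:
    """Hypothetical implementation that talks to the gateway."""

    def __init__(self, gateway: PaymentGateway):
        self.gateway = gateway

    def pay(self, customer_id: str, cents: int) -> str:
        if cents <= 0:
            raise ValueError("amount must be positive")
        return self.gateway.charge(customer_id, cents)


def test_pay_delegates_to_gateway():
    gateway = Mock(spec=PaymentGateway)  # only attributes of the interface exist
    gateway.charge.return_value = "receipt-1"
    assert Checkout(gateway).pay("c42", 500) == "receipt-1"
    gateway.charge.assert_called_once_with("c42", 500)
```

Reading a test like this tells you the one thing `Checkout` needs from the outside world, which is the kind of note worth putting on that interface's notebook page.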