You need a clear understanding of the point of the analysis before you analyze anything. What, specifically, does your team have to produce? How much time, and how many people, do you have to complete the work? If you're the leading edge of an effort to rewrite 100MLoc, my presumption is that your deliverable is mainly a 'gross anatomy' of the system: a basic description of the major structural components and how they interact with each other. If that's the case, I'd start by looking at the build scripts and the modules they build. Try to make a comprehensive list of major components. You'll get it wrong initially, but you need a starting point.

The next thing I'd do is take the top-level list of modules and start assigning them to individual people on the team. Their responsibility is to produce some kind of top-level description of how each module works. A big part of this phase of the effort should be meetings or informal conversations as the per-module analysis progresses. As your team talks among themselves, you should be able to find commonality between modules, communication links, and so on. The key at this point is to stay high level and avoid getting bogged down in the details; with this much code, there are plenty of details to get bogged down in. As a result, you'll probably be left with some mysteries about how the code actually works beneath various abstraction layers. Make and update a list of these 'mysteries' and keep it next to your team's list of modules. As you work through the list of modules, some of these mysteries will solve themselves, and some will be so obviously important that it's worth a detailed deep dive to really understand what's happening.
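A first draft of that component list can often be scripted rather than compiled by hand. The sketch below walks a source tree and treats every directory containing a build file as a candidate component; the set of build-file names is illustrative, and you'd adjust it to whatever build system the codebase actually uses.

```python
from pathlib import Path

# Names commonly used by build systems. These are illustrative, not
# exhaustive -- swap in whatever this codebase actually builds with.
BUILD_FILES = {"Makefile", "CMakeLists.txt", "BUILD", "pom.xml", "build.gradle"}

def find_candidate_components(root):
    """Return directories that contain a build file, as a first-draft
    component list. Expect it to be wrong in places -- it is a starting
    point for the team, not a final answer."""
    root = Path(root)
    components = set()
    for path in root.rglob("*"):
        if path.is_file() and path.name in BUILD_FILES:
            components.add(path.parent.relative_to(root))
    return sorted(components)

if __name__ == "__main__":
    import sys
    for component in find_candidate_components(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(component)
```

Even a crude list like this gives each team member a concrete module to own while the real structure gets sorted out in conversation.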
Either way, there will be times when you have no idea what's going on in the codebase, and you'll just have to trust that you'll figure it out later.

One final comment: as silly as SLoC is as a measure of the size of a software system, you're looking at a large software package (bigger than Windows, Facebook, Linux, OS X, etc.). If you take each line of code to have cost $5-10, then at 100M lines the system arguably cost $500M-$1B to build in the first place.

Because of the size of the system, you shouldn't expect your analysis work to be easy, fast, or cheap. Buy the tools you need to do the work. That means technical and domain training, software, hardware, process development, new staff... basically whatever you need to make the work happen. You're at the point where long-term investments are highly likely to pay off, because your scope is so large and your timeline is entirely in front of you.

I'd also highly recommend working this problem from two angles. You can understand the existing system by looking at the code, but you also need to clearly understand the system requirements from the 'business' point of view. If you're doing bottom-up analysis, then some other group needs to be doing top-down. Along those lines, you should also start thinking about deployment strategies. I highly recommend avoiding a big-bang deployment of a system that large, so there will be some period when you're liable to be running both the 'old world' and the 'new world' systems at the same time. Think about how you want to do that...

There is a lot to think about here, because this is a complex problem. Hopefully I've given you at least a little bit to think about. Good luck.
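Running both worlds at once usually means putting a routing layer in front of them and cutting pieces over one at a time (often called a strangler-fig migration). A minimal sketch of the idea, with entirely hypothetical names -- in practice the unit of cut-over might be an endpoint, a message type, or a batch job rather than a feature name:

```python
# Hypothetical stand-ins for the two systems; in reality these would be
# RPC calls, HTTP proxying, message forwarding, etc.
def old_system(feature, payload):
    return f"old:{feature}:{payload}"

def new_system(feature, payload):
    return f"new:{feature}:{payload}"

# Grows as modules are rewritten and verified against the old behavior.
MIGRATED = {"billing"}

def handle(feature, payload):
    """Route one request to whichever implementation currently owns
    the feature; everything not yet migrated falls through to the old
    system."""
    if feature in MIGRATED:
        return new_system(feature, payload)
    return old_system(feature, payload)
```

The point of the façade is that cut-over (and roll-back) becomes a one-line change to the registry rather than a redeployment of either system.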