Undebt: How We Refactored 3M Lines of Code

245 points by vivagn over 8 years ago

25 comments

necubi over 8 years ago
For Java, IntelliJ has a built-in version of this called "structural search and replace" [0]. This is incredibly useful when a library changes an API or you need to refactor a lot of similar code.

This feels relatively safe in Java because tooling can statically know a lot about your code (and can know for sure that a particular call site is the method or class you're targeting). I'd be terrified to do it in Python without a very thorough test suite.

[0] https://www.jetbrains.com/help/idea/2016.2/structural-search-and-replace.html
wRastel27 over 8 years ago
"...time that could be better spent working on new features and shipping new code"

Can we please stop putting forth this idea that features >>> reliable product? The amount of dev time that a company will save from removing technical debt will likely be worth more than the extra sales the company will get from a new feature. I look forward to the day when the executive team comes to the developers and asks why they are working on features instead of cutting down technical debt.
Kendrick2 over 8 years ago
How do web applications explode out to 3 million lines of code? Yelp, to me, looks like a typical CRUD app and I would have been surprised if it were more than 100,000 lines of code. The software I develop is pretty large and typically doesn't surpass 40,000 SLOC written in-house (i.e. excluding third-party libs).

Does anyone here maintain such large codebases? Are they truly that big or are people just counting third-party code and generated stuff?
vintermann over 8 years ago
It would be nice if some research institution would pay for the rehabilitation of some huge, bloated, ancient, but relatively unimportant app. Ideally by independent teams in parallel.

Just to get some real data on what works, rather than anecdotes from veterans.
pmarreck over 8 years ago
I have my doubts that a refactor that consists merely of more complex search-and-replace actions is a true refactor. Also, this reads more like an ad for their Python tool than an account of any lessons learned during this refactoring.
wry_discontent over 8 years ago
It seems to me this is only going to handle the most trivial kind of technical debt. This kind of tool can't manage the way you organized your codebase, for instance. There's more to refactoring than find-and-replace.
yitchelle over 8 years ago
I would also be interested in the thought process behind deciding what functionality to refactor. Did you review the code and identify areas before unleashing your tool on it?

With 3M lines of code gone, it must be terrifying to feel that it may have broken something. How did you ensure that it is still working as before?

Edit: Grammatical corrections.
knocte over 8 years ago
This smells like a need that arises from using a dynamically typed language, because the example given seems to be just getting rid of usages of a certain method and replacing them with a new one. In a statically typed language, e.g. C#, you just mark the old method with an [Obsolete] attribute and go fix all the warnings. (Granted, a tool that replaces all these usages is also useful, but to me, there are much more complex forms of technical debt than just obsolete methods.)
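For comparison, here is a minimal sketch of what a rough Python counterpart to the [Obsolete] workflow might look like. The `deprecated` decorator and the `old_total` / `new_total` functions are hypothetical names made up for illustration; only `warnings.warn` and `functools.wraps` come from the standard library. Unlike a C# compiler warning, this only fires when the deprecated call actually runs, so it relies on tests or production traffic exercising the call site.

```python
import functools
import warnings


def deprecated(replacement):
    """Hypothetical helper: wrap a function so each call emits a DeprecationWarning."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller, not to this wrapper
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def new_total(values):
    """The replacement API (illustrative)."""
    return sum(values)


@deprecated("new_total")
def old_total(values):
    """The old API, kept working while callers migrate (illustrative)."""
    return new_total(values)
```

Running a test suite with warnings turned into errors (for example `python -W error::DeprecationWarning`) is one way to get closer to the "go fix all the warnings" loop described above.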
NikhilVerma over 8 years ago
IMO a much better approach is JSCodeShift, which works based on the AST: https://github.com/facebook/jscodeshift
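To illustrate the AST-based idea in Python (the language of the codebase under discussion), here is a minimal sketch using only the standard-library `ast` module. It is not jscodeshift's API and not how Undebt works; `RenameCall`, `rename_calls`, `old_fn` and `new_fn` are hypothetical names. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rename calls to a bare function name, leaving strings and other uses alone."""

    def __init__(self, old_name, new_name):
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node):
        # Visit children first so nested calls are rewritten too
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            new_func = ast.Name(id=self.new_name, ctx=ast.Load())
            node.func = ast.copy_location(new_func, node.func)
        return node


def rename_calls(source, old_name, new_name):
    """Parse source, rewrite matching call sites, and return regenerated code."""
    tree = RenameCall(old_name, new_name).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    sample = 'x = old_fn(1)\nnote = "old_fn(1)"\n'
    print(rename_calls(sample, "old_fn", "new_fn"))
```

Because the match is on a call node rather than on text, the `"old_fn(1)"` string literal is left untouched. The trade-off is that `ast.unparse` regenerates the source, so comments and original formatting are lost, which is one reason real codemod tools tend to use a syntax tree that preserves formatting.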
karlheinz over 8 years ago
"Let a 1,000 flowers bloom. Then rip 999 of them out by the roots."

This is a paraphrase of Chairman Mao:

"The policy of letting a hundred flowers bloom and a hundred schools of thought contend is designed to promote the flourishing of the arts and the progress of science"

And the ripping out by the roots part brings labor camps to mind:

"After this brief period of liberalization, Mao abruptly changed course. The crackdown continued through 1957 as an Anti-Rightist Campaign against those who were critical of the regime and its ideology. Those targeted were publicly criticized and condemned to prison labor camps."

https://en.wikipedia.org/wiki/Hundred_Flowers_Campaign
ajdlinux over 8 years ago
Related: http://coccinelle.lip6.fr/

Coccinelle is used extensively by Linux kernel developers for a whole tonne of things like this.
crdoconnor over 8 years ago
* Newline at EOF
* Double quoted docstring
* Remove unused imports.

These things are largely cosmetic.
impish19 over 8 years ago
Having an intern write a good blog post for the engineering blog is a great recruiting move.
manigandham over 8 years ago
Refactoring... "puts a massive drain on developer time; time that could be better spent working on new features and shipping new code"

This is the wrong way to think. Refactoring will save time by making code faster and more reliable, and by making it easier to build those new features in the first place. Looks like their biggest issue is bad technical management, not deprecated code.
spapas82 over 8 years ago
Was the 3M LOC refactoring for yelp.com? The article doesn't say. How could a review site possibly have 3M LOC?
mahyarm over 8 years ago
The amount of code you have in a company is usually a function of the number of engineers you have.
pmarreck over 8 years ago
A funny and relevant tweet:

https://twitter.com/php_ceo/status/765298072691806209
fleaflicker over 8 years ago
Google has a similar tool called Refaster:

https://github.com/google/Refaster
ilostmykeys over 8 years ago
The patterns that Undebt removes could be non-existent in the code base, yet the conceptual and algorithmic design decisions and the architecture could all be bad, and that is what the actual technical debt is. Bad code patterns are just a tiny slice of the problem in most cases.
hendry over 8 years ago
If any of you guys are interested in plotting total lines-of-code changes, do check out https://github.com/kaihendry/graphsloc
kctess5 over 8 years ago
It's like codesearch [1].

[1] https://github.com/google/codesearch
yarper over 8 years ago
The fact that you need a find-and-replace tool to make sensible changes to your codebase indicates that it's hugely out of control already.
ksec over 8 years ago
Interesting, I have always (wrongly) remembered Yelp as a Ruby on Rails shop. Does anyone know the stack behind Yelp?
avindroth over 8 years ago
Is there a Haskell version of this?
denfromufa over 8 years ago
Why pyparsing, rather than ply or regex?