It looks like the exponential model isn't a good fit at all -- in all cases it undershoots the decay at the start of the graph and overshoots at the tail end. So while it might "look close", there is some systematic effect that your model doesn't account for. In particular, I don't agree that all code in a codebase has a constant risk of being replaced -- most projects have different components that are developed at different rates. Some components are legacy code that is likely never to change, while other parts are under rapid development. In fact, I'd argue that's why the tail is so long -- legacy code is called "legacy" for a reason. And the tip of the graph dives down so quickly because code being rapidly developed has a higher chance of being replaced.
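A quick way to test this is to fit a two-component mixture (fast-churning code plus long-lived legacy code) against a single exponential and compare residuals. A minimal sketch, with invented survival data standing in for the real cohort curves:

```python
# Compare a single-exponential survival model against a two-component
# mixture: fast-churning code plus stable legacy code.
# The "surviving" data below is invented for illustration.
import numpy as np
from scipy.optimize import curve_fit

def single_exp(t, lam):
    return np.exp(-lam * t)

def mixture(t, w, lam_fast, lam_slow):
    # w is the fraction of "hot" code that churns quickly
    return w * np.exp(-lam_fast * t) + (1 - w) * np.exp(-lam_slow * t)

t = np.linspace(0, 10, 50)  # years since a cohort was written
surviving = 0.6 * np.exp(-1.5 * t) + 0.4 * np.exp(-0.05 * t)  # fake data

(lam,), _ = curve_fit(single_exp, t, surviving, p0=[0.3])
(w, lf, ls), _ = curve_fit(mixture, t, surviving, p0=[0.5, 1.0, 0.1],
                           bounds=([0, 0, 0], [1, 10, 10]))

sse_single = np.sum((single_exp(t, lam) - surviving) ** 2)
sse_mixture = np.sum((mixture(t, w, lf, ls) - surviving) ** 2)
print(f"single exponential SSE: {sse_single:.5f}")
print(f"two-rate mixture SSE:   {sse_mixture:.5f}")
```

If the mixture consistently fits better across repos, that would support the "different components, different rates" story.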
I'd posit that the reason is a nuance of #2: more thought was put into the design of older code, into how it should work, before making it work. Now we write code so fast that we have to scrap it all and do it again a second time to fix the mistakes of the first time [0]. I'm of the mind that upfront planning would likely have taken less time, but that's simply my opinion and I don't have anything to back it up besides anecdotal experience. The current practice of "move fast and break things" could very well be a better approach.

[0] I'm only adding this footnote because the article picks on Angular (fairly or unfairly), and the point it annotates potentially applies to them.
Just amazing.

I wonder if there are any research articles discussing the correlation between code change and other metrics like product quality, change frequency of team members, estimation success, etc.
Look at how consistently the lines of code grow for these projects. I doubt that is surprising, but think about the implications. Linux is a pretty old open source project, and still, on balance, the lines of code just grow.

How many lines of code will it be in fifty years? Will we have to come up with new systems to manage the fact that individuals only really understand smaller and smaller pieces of it? Will it reach a mass where, like a black hole, it collapses from some incomprehensible failure?

There have never been things like this that just grow in complexity forever.
My rule of thumb is that if a line of code survives its first five years, it'll live forever. The age of a piece of code is the single greatest predictor of its future.
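This would be checkable from blame data. A rough sketch of the conditional-survival arithmetic, using invented per-line records in place of real ones:

```python
# Crude check of the five-year rule via conditional survival.
# Each record is (age_in_years, still_alive); dead lines record the
# age at which they were deleted. The data here is invented.
import random

random.seed(0)
records = [(random.expovariate(0.3), False) for _ in range(900)]
records += [(random.uniform(5, 20), True) for _ in range(100)]

def survival_beyond(records, horizon):
    # Still-alive lines younger than the horizon are censored: we
    # can't know their fate yet, so they're excluded entirely.
    eligible = [(age, alive) for age, alive in records
                if not alive or age >= horizon]
    survived = sum(1 for age, _ in eligible if age >= horizon)
    return survived / len(eligible)

p5 = survival_beyond(records, 5)
p10 = survival_beyond(records, 10)
print(f"P(live past 5y)            = {p5:.2f}")
print(f"P(live past 10y | past 5y) = {p10 / p5:.2f}")
```

If the rule holds, the conditional probability should be much closer to 1 than the unconditional one.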
Code from year one may still be the same code, but when it gets moved or reformatted, its cohort is updated. If a bad change is reverted, will the cohort for those lines of code also be reverted? The effect of understated longevity is not so obvious when it is gradual and organic. Sometimes an event in a project's history makes the effect very obvious.

https://blog.yoavfarhi.com/2016/12/06/half-life-wordpress-code/
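For what it's worth, git blame can be told to ignore whitespace-only changes and to follow moved or copied lines, which should reduce (though not eliminate) this kind of cohort churn. A small sketch; the year bucketing is deliberately crude:

```python
# Count surviving lines per authorship year, telling blame to ignore
# whitespace-only changes (-w) and to follow lines moved within (-M)
# or copied between (-C) files, so reformatting doesn't reset cohorts.
import subprocess
from collections import Counter

def blame_years(repo, path):
    out = subprocess.run(
        ["git", "-C", repo, "blame", "-w", "-M", "-C",
         "--line-porcelain", path],
        capture_output=True, text=True, check=True).stdout
    years = Counter()
    for line in out.splitlines():
        if line.startswith("author-time "):
            ts = int(line.split()[1])
            years[1970 + ts // 31_556_952] += 1  # crude epoch -> year
    return years

print(blame_years(".", "README.md"))
```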
I really like this.
Suppose we were to accept the suggestion that linearly decaying code is better built, and more robust, than exponentially decaying code.

Would this give newbies a great new tool to answer their question of which framework or language to learn?

Rails or Django? Django lasts longer.
Angular or React?
Vue.js just a trend?

You could answer all these questions with this kind of analysis.

If someone wants to make a genuine contribution, a blog post contrasting the various decays of JavaScript frameworks would be a hit.
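The fitting side of that blog post would be little code. A sketch, assuming you've already produced a survival curve per project (e.g. by running the article's analysis on each repo); the project names and curves below are invented placeholders:

```python
# Fit a half-life per project from its survival curve and rank them.
# The curves here are invented placeholders; real ones could come
# from running the article's analysis on each repository.
import numpy as np
from scipy.optimize import curve_fit

def exp_decay(t, half_life):
    return 0.5 ** (t / half_life)

t = np.linspace(0, 6, 25)  # years
curves = {
    "framework-a": 0.5 ** (t / 4.0),  # fake survival fractions
    "framework-b": 0.5 ** (t / 2.5),
    "framework-c": 0.5 ** (t / 5.5),
}

fits = {}
for name, surv in curves.items():
    (hl,), _ = curve_fit(exp_decay, t, surv, p0=[3.0])
    fits[name] = hl

for name, hl in sorted(fits.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} half-life ≈ {hl:.1f} years")
```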
Clojure takes a remarkably stable and additive approach to maintenance and growth. Graph for it is here: http://imgur.com/a/rH8DC
I'm not a dad and I actually appreciated the "Git of Theseus" pun. The ship is an interesting thought experiment on identity; maybe you can modernize it to introduce philosophy to computer science students ^^

But more on-topic: nice article, well done!
Good job and great tool. Thanks.

This might actually be an interesting metric with regard to project architecture and/or project management.

Totally agree that the exponential model's explanatory power is great.
A lot of projects I've worked on have utility libraries consisting of mostly stateless, pure functions -- I have a theory that these constitute some of the longest-lived code. That, and database models, which tend to be easier to expand than to contract. I'd be curious to see some analysis along these lines.
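One crude way to probe this is to bucket blame ages by top-level directory and compare medians, on the assumption that directory names (util/, models/, and the like) roughly track the kind of code they contain. A sketch:

```python
# Median blame age per top-level directory of a git checkout.
# Slow on big repos (one blame per file); intended as a rough probe.
import subprocess, time, statistics
from collections import defaultdict

def ages_by_topdir(repo):
    files = subprocess.run(
        ["git", "-C", repo, "ls-files"],
        capture_output=True, text=True, check=True).stdout.splitlines()
    now = time.time()
    ages = defaultdict(list)
    for path in files:
        top = path.split("/")[0] if "/" in path else "(root)"
        blame = subprocess.run(
            ["git", "-C", repo, "blame", "--line-porcelain", path],
            capture_output=True, text=True).stdout  # binaries yield ""
        for line in blame.splitlines():
            if line.startswith("author-time "):
                ages[top].append((now - int(line.split()[1])) / 3.15e7)
    return {d: statistics.median(a) for d, a in ages.items() if a}

for d, age in sorted(ages_by_topdir(".").items(), key=lambda kv: -kv[1]):
    print(f"{d:20s} median line age: {age:.1f} years")
```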
Interesting, but it is not surprising that git and redis are more stable than node and angular. Git and redis solve well-defined problems that won't change that much. Angular is a framework and node is a platform; they should change more. But then again... JavaScript fatigue is a thing, from what I've heard.
The simplest explanation is that the exponential model is not a good one (it does not correspond to the underlying dynamics), and so the half-life value is not an inherent (time-independent) property of the codebase, but depends on its age. It seems to me that in most projects, the code evolves quickly in the beginning and then stabilizes into a slower linear decay. This would explain the observed dependence of the fitted half-life on age. It might be more meaningful to fit a linear dependence at the beginning of the project and in the asymptotic regime, and also look at their ratio. This should be more stable, would tell you how well the project was designed from the beginning, and would also indicate whether the project has already stabilized or not.
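For concreteness, a minimal sketch of that two-regime fit on a synthetic survival curve; the one-year and five-year cutoffs are arbitrary choices:

```python
# Fit separate linear decay rates to the early and asymptotic parts
# of a survival curve and compare them. Synthetic data; the one-year
# and five-year cutoffs are arbitrary.
import numpy as np

t = np.linspace(0, 10, 100)  # years
surviving = 0.6 * np.exp(-1.5 * t) + 0.4 * (1 - 0.02 * t)  # fake curve

early, late = t < 1.0, t > 5.0
slope_early = np.polyfit(t[early], surviving[early], 1)[0]
slope_late = np.polyfit(t[late], surviving[late], 1)[0]

print(f"early decay rate: {slope_early:+.3f} /year")
print(f"late decay rate:  {slope_late:+.3f} /year")
print(f"ratio:            {slope_early / slope_late:.1f}")
```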
Would love to see this restricted to the lines that are actually executed in typical use, i.e. ignoring dead code that is still in the repo.

Probably wouldn't be too hard for the interpreted languages.
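For Python projects, coverage.py gets most of the way there: run the suite under coverage, export the JSON report, and intersect its executed lines with blame ages. A sketch, assuming the report's per-file executed_lines field and repo-relative paths:

```python
# Intersect blame ages with lines actually executed, per coverage.py's
# JSON report (coverage run -m pytest; coverage json). Assumes the
# report paths are repo-relative, matching what git blame expects.
import json, subprocess, time

def live_line_ages(repo, coverage_json):
    with open(coverage_json) as f:
        report = json.load(f)
    now, ages = time.time(), []
    for path, data in report["files"].items():
        executed = set(data["executed_lines"])
        blame = subprocess.run(
            ["git", "-C", repo, "blame", "--line-porcelain", path],
            capture_output=True, text=True).stdout
        lineno, ts = 0, None
        for line in blame.splitlines():
            if line.startswith("author-time "):
                ts = int(line.split()[1])
            elif line.startswith("\t"):  # the content line itself
                lineno += 1
                if lineno in executed and ts is not None:
                    ages.append((now - ts) / 3.15e7)
    return ages

ages = live_line_ages(".", "coverage.json")
print(f"{len(ages)} executed lines, mean age "
      f"{sum(ages) / len(ages):.1f} years")
```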
I wish we could capture the "inventiveness" of a particular project: how well the problem was understood when the project began.

There had been _many_ *nixes by 2006, so the territory had significant prior art and, with it, a collective deep understanding of the problems being solved. Angular sprouted alongside a number of other SPA frameworks in an ecosystem that was experiencing a "growth spurt" (using that term loosely), with lots of variables.
This is awesome. I ran it on pyramid for fun: http://imgur.com/a/KZ9KR
Another feature of the model could be stability of product vision. Is there a correlation between the half-life of committer membership and that of the code? How has the problem space of the product changed over time?

Perhaps we could talk about "intrinsic churn" vs. "accidental churn". The former results from the codebase keeping up with the "drift" in the problem space; the latter comes from having to learn.
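The committer half of that correlation is easy to approximate from the log alone. A sketch that treats the span between an author's first and last commit as their tenure (a crude proxy for membership):

```python
# Approximate committer "tenure" as the span between an author's
# first and last commit, then summarize. A crude membership proxy.
import subprocess
from collections import defaultdict
from statistics import median

def author_tenures(repo):
    log = subprocess.run(
        ["git", "-C", repo, "log", "--format=%ae %at"],
        capture_output=True, text=True, check=True).stdout
    stamps = defaultdict(list)
    for line in log.splitlines():
        email, ts = line.rsplit(" ", 1)
        stamps[email].append(int(ts))
    return {e: (max(t) - min(t)) / 3.15e7 for e, t in stamps.items()}

tenures = author_tenures(".")
print(f"{len(tenures)} authors, median tenure "
      f"{median(tenures.values()):.2f} years")
```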
Super interesting.

But it doesn't seem fair to directly compare the Linux of 2005, a fourteen-year-old project already on the stable 2.6 kernel, with projects whose initial releases were basically foundational, without mentioning that.