I always like to see how an API is used in real projects. Sadly, GitHub search is mostly useless for this because of the number of duplicates. Google Code Search was great; it even supported regexps. Then there was koders.com, and now there's also something from ohloh, which is better than GitHub AFAIR.<p>EDIT: ohloh became OpenHub, and the code search has since been discontinued. So there's the nonfunctional GitHub search and an open niche for other projects...
Wow, GitHub could save a lot of storage space if they dedup'd across projects/files explicitly, rather than storing raw Git repos, which is what I assume they do.<p>Even with a good deduplicating/compressing filesystem, the way Git history is stored means they're probably missing out on a ton of savings here. Then again, it's probably not worth the complexity of deviating from standard Git tooling.
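As a rough illustration of the kind of content-addressed dedup being described (a hypothetical toy sketch, not how GitHub actually stores data): identical file contents hash to the same key, so a file vendored into a thousand repos costs one physical copy.

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical file contents are
    stored once, no matter how many repos/paths reference them."""

    def __init__(self):
        self.blobs = {}   # sha256 hex digest -> file bytes
        self.repos = {}   # repo name -> {path: digest}

    def add_file(self, repo: str, path: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # physical copy stored only once
        self.repos.setdefault(repo, {})[path] = digest

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blobs.values())

store = DedupStore()
lib = b"function leftPad(s, n) { /* ... */ }"  # hypothetical shared dependency
store.add_file("project-a", "vendor/left-pad.js", lib)
store.add_file("project-b", "node_modules/left-pad/index.js", lib)

# Two logical files, one physical blob:
print(len(store.blobs))      # 1
print(store.stored_bytes())  # size of a single copy
```

Git's own object store already does this within a repo (blobs are addressed by content hash), but the hosting layer would have to share the object store across repos to get the cross-project savings the parent describes.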
This is very interesting. I would have liked to see the results for JavaScript with the node_modules folder excluded. If that counts as code duplication, then pip dependencies should be included as well.<p>This should definitely be taken as a lesson, though: JS needs a better deployment solution. That, or better education on the current solution(s).
Would love to see a follow-up showing how much duplication remains after controlling for common dependencies and autogenerated code, along with data on how many repositories are full clones (i.e., nearly all of their code is identical to another repository's).
Very interesting from a security perspective: so much potentially dangerous code copy-pasted, and most of it probably never updated, either. I've personally found C vulnerabilities in code that, just by Googling the vulnerable line, I could see was used in many projects... Usually there isn't much to be done about it, either.
Now predicting automated software that scans for duplicated code, flags it for violating license agreements, and sues for money.<p>Welcome to the future of copyright trolls.