Wikipedia and Internet Archive partner to fix 1M broken links on Wikipedia

494 points by The_ed17, over 8 years ago

15 comments

jacquesm, over 8 years ago
In the long term the Internet Archive will likely be the major supplier of references to Wikipedia. Webpages don't live forever; hopefully the Internet Archive does. It's an extremely valuable resource. The Archive and Wikipedia are among the most valuable digital assets we have.
eriknstr, over 8 years ago
The archive.is guy also provides mirrors of rotten links to Wikipedia, not as the result of any official agreement with Wikipedia but on his own initiative, which I think was nice of him.

Encyclopedia Dramatica is generally not a reputable source of truth, being the site that it is, but while looking for more information on archive.is mirroring of links from Wikipedia articles, I found an article on ED that I found interesting. It heavily advocates one side of the story, but at least it backs it up with some links, which is rather rare on ED (most links on ED go to other pages on ED, in my experience).

https://encyclopediadramatica.se/Archive.is
shortformblog, over 8 years ago
Excellent news. Should note that today is the 20th anniversary of the Internet Archive: https://blog.archive.org/2016/10/26/making-the-web-more-reliable-20-years-and-counting/
qwertyuiop924, over 8 years ago
I'm really glad this is happening. Wikipedia needs to clean up their broken links, and this could help the Archive get a wider sampling of websites, so as to preserve more data.

Websites going offline is a huge problem. For example, the now-famous thread from which sleepsort originated (on 4chan's /prog/ textboard) isn't archived anywhere: textboard threads are immortal, so nobody thought to archive any threads until dis.4chan.org went down for good.

Thankfully, some bright spark managed to save the sqlite databases for most of the boards on dis to the Internet Archive, so I was able to track down the thread eventually.
pmiller2, over 8 years ago
This is a huge step forward for Wikipedia as an authoritative source of information. Glad to see this happening. :)

OT: I considered applying to the Internet Archive last time I was looking for work, but their office is too hard to commute to coming from the East Bay. :(
ideonexus, over 8 years ago
This whole discussion reminds me of how all MySpace content was destroyed in a rash corporate decision years ago. Just like that, five years of the most popular social networking site on the World Wide Web and all its history were wiped out:

http://activehistory.ca/2013/06/myspace-is-cool-again-too-bad-they-destroyed-history-along-the-way/

Unfortunately, the Internet Archive was only able to get the non-logged-in version of the site. All those loud, obnoxious profile pages users spent endless hours working on? We only have oral histories now to remember them.
caf, over 8 years ago
It'd be great if StackOverflow approached the Internet Archive about doing the same for their broken links, too.
sengork, over 8 years ago
Internet Archive should look into distributed models such as IPFS for storage of the archived sites.
felipesabino, over 8 years ago
Having already clicked on several broken links, I wonder how many links per article, in absolute numbers or as a percentage, are likely to be broken.

I might be way off, but doesn't 1M seem like a low number for Wikipedia's size? What is that as a percentage of the total number of links? Does anyone know?
youdontknowtho, over 8 years ago
Wow. I really love the Internet Archive as a project. This is a great use of it. Looking forward to seeing how that will work out.

I wonder if they will publish a list of replaced links after the fact?
h1d, over 8 years ago
What's stopping Wikipedia from just archiving the referenced pages on edit?

It would be far more reliable than depending on the Internet Archive, which may not have the page archived and, more likely, would have archived it at a different time from when it was referenced.

It would cost some more disk space and bandwidth, which of course are already a strain on them, but in turn it would greatly improve usability and reliability.
raverbashing, over 8 years ago
One corner case that exists: content is linked on Wikipedia, and that content is later taken down due to a copyright violation.

(I suppose Archive.org would be asked to take the content down.)
torrent-of-ions, over 8 years ago
Why does the headline say "to fix 1M broken links" when the article says it's already been done?
45h34jh53k4j, over 8 years ago
(red heart)(yellow heart)(green heart)(blue heart) Internet Archive (red heart)(yellow heart)(green heart)(blue emoji)

There are few more noble pursuits than archiving the sum of human knowledge.
alecco, over 8 years ago
On a side note, it makes me very sad how Wikipedia editors are often pushing some political agenda. I'm relying on it for fewer and fewer topics, and clearly not for anything that can be affected by US politics or SJW-style controversies.