Google Newspaper Archive

124 points by jervisfm, about 11 years ago

10 comments

300bps, about 11 years ago
I've done a lot of genealogy work for my family name and have used http://newspaperarchive.com/ extensively. In comparing a few quick searches, the Google Newspaper Archive is not even comparable. We're talking 2 irrelevant hits for Google Newspaper Archive vs. thousands of relevant hits for newspaperarchive.com.

And newspaperarchive.com only has a fraction of the newspapers in the country within their records. There is definitely a lot of room for improvement in this space because it's such a large task.
IanCal, about 11 years ago
This is incredible! The search function works well, which means they've OCR'd the papers. Is there a way of grabbing this text? I've not seen anything obvious.

Also the "link to this article" doesn't seem to work for me, although the search had taken me to the article just fine.
andybak, about 11 years ago
The search/OCR seems patchy. I tried a few (presumably) unique phrases from some papers and the article wasn't found.

For example with:

http://news.google.com/newspapers?nid=PQY3Tb_h0-cC&dat=19111206&printsec=frontpage&hl=en

I tried:

"marshalling that unspeakable parade" (wonderful phrase!)

another dull and listless session

cattle prices high granby quebec

and various other phrases from the home page both with and without quotes. Nothing returned the edition in question.
zxexz, about 11 years ago
Holy shit, this is awesome. Lots of papers. LOTS. Even local papers. And the resolution is good!
avighnay, about 11 years ago
Just checked it out and unfortunately landed on The Times edition from 1804. The paper was filled with classifieds announcing rewards for returning lost slaves, and the casual manner of those ads made me lose my appetite for browsing further... very different times they were...
mpclark, about 11 years ago
Does anyone know how to submit a newspaper to this archive? I have all 51 editions of a now closed newspaper in PDF format and it would be lovely to find them a home here...
wikiburner, about 11 years ago
Slightly related, but does anyone know where to get the equivalent of news.google.com or news.yahoo.com, but with more than 30 days of history? Ideally several years' worth.

Lexis/Nexis appears to only cover print news, and their articles aren't timestamped.
mburst, about 11 years ago
The Google logo at the top appears misaligned for me. Also when I click it, it redirects to a 404. Nonetheless very cool archive.
eshvk, about 11 years ago
I wonder if there is an easy way of downloading this and OCRing it. I would love to use this as training material for some ML algos.
devindotcom, about 11 years ago
Jesus, this is fantastic. As others have pointed out, OCR isn't so hot, but you should be able to nab topics and names.