Internet Archive Scholar: Search Millions of Research Papers

342 points by bnewbold about 4 years ago

14 comments

bnewbold about 4 years ago
This service was hinted at back in September, but is now formally announced and live at https://scholar.archive.org
Related previous post: https://news.ycombinator.com/item?id=24485444
Much of the catalog functionality can be accessed from the fatcat.wiki API (https://api.fatcat.wiki/redoc). Scholar adds a search index over the body content of papers, and we are still thinking through how to make this available through a public API without making query latency even worse.
Folks here might also be interested in this CLI for interfacing with the catalog and making edits: https://gitlab.com/bnewbold/fatcat-cli
marcodiego about 4 years ago
The Internet Archive is becoming an alternative, good internet. It has a web archive, film archive, software archive, media archive... and now a research paper archive. That is the internet as a giant library, as we dreamed of in the early 90's.
capableweb about 4 years ago
Internet Archive strikes again! I love Internet Archive, not just for archiving websites but for archiving everything and making it easily accessible. This is another great service that'll help a lot of researchers and hobby-researchers, which is lovely to see.
Don't forget to donate if you also like Internet Archive, they need every penny: https://archive.org/donate/?origin=hn
betamaxthetape about 4 years ago
This is amazing. I had a play around with it whilst it was in beta, and was blown away by the variety of papers returned. On a whim I searched for a very obscure topic that I'd researched before (just for personal interest) using worldcat / google scholar, and to my surprise was presented with several highly relevant papers I'd never come across before that were *exactly* what I was looking for.
nl about 4 years ago
This seems pretty good.
In computer science we are pretty lucky because open access is the norm.
I checked a few well-known exceptions, and this seems to find them ok.
"Mastering the game of Go without human knowledge" (DeepMind in Nature): https://scholar.archive.org/search?q=key:work_yqdj7vjbefg7hhth4ccyptnddu
"Typing candidate answers using type coercion" (IBM Watson special edition, IEEE IBM Systems Journals): https://scholar.archive.org/search?q=key:work_dym4lqay5fcdxogcogheg44iua
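Both links above share one shape: a `key:work_...` query passed to the Scholar search page. A minimal sketch of building such a link follows. It assumes only the URL pattern visible in those two links; the helper name `scholar_work_url` is made up for illustration, and this is not a documented API.

```python
from urllib.parse import urlencode

# Base URL taken from the links in the comment above.
SCHOLAR_SEARCH = "https://scholar.archive.org/search"


def scholar_work_url(work_key: str) -> str:
    """Build a Scholar search link for one work key (hypothetical helper)."""
    # safe=":" leaves the colon in "key:work_..." unescaped,
    # so the result matches the form of the links quoted above.
    query = urlencode({"q": f"key:{work_key}"}, safe=":")
    return f"{SCHOLAR_SEARCH}?{query}"


if __name__ == "__main__":
    print(scholar_work_url("work_yqdj7vjbefg7hhth4ccyptnddu"))
```

Whether other query prefixes behave the same way is not something the thread confirms, so treat the `key:` prefix as specific to these examples.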
nathias about 4 years ago
archive.org is really one of the few things still good on the internet. It has been invaluable for my studies; I can't imagine what previous generations, who could only access 5% of sources, were even doing.
8bitsrule about 4 years ago
Oh yeah! Tried this on several specific topics I've looked at recently (2 years ago, 7ya, and 150ya) and the results were fast and on the mark. I'll certainly favor using Scholar over IA searches. Congratulations!
pasttense01 about 4 years ago
How does this compare to BASE and why isn't BASE used as a source?
"BASE is one of the world's most voluminous search engines especially for academic web resources. BASE provides more than 240 million documents from more than 8,000 content providers. You can access the full texts of about 60% of the indexed documents for free (Open Access). BASE is operated by Bielefeld University Library."
https://www.base-search.net/
masswerk about 4 years ago
This is nice! I just managed to find an article I couldn't find with Google.
Thus, I was able to solve the PDP-1 "Amherst Mystery" [1]: https://www.masswerk.at/nowgobang/2021/pdp1-spotting#update
[1] https://news.ycombinator.com/item?id=26313124
sundarurfriend about 4 years ago
(OffTopic) All this talk about the logo here made me check the page out, instead of moving on after reading just the comments as I might otherwise have done. Perhaps that's a HN strategy to use, to get people to actually click through - add a bikesheddy thing to the page that's likely to be divisive, but doesn't require thought. Gives us a cheap way to have an opinion, and thus an incentive to click!
tasogare about 4 years ago
What are the differences and advantages over Sci-Hub?
carbocation about 4 years ago
Interesting. For my field (cardiovascular genetics), the results weren't really what I was expecting. I think that my expectations probably fit pretty well with a PageRank graph of citations. So my guess is that the "relevancy" is semantic only?
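For readers unfamiliar with the distinction drawn here: a PageRank-style ranking scores a paper by who cites it, not by how well its text matches the query. The sketch below runs a plain power iteration over a toy citation graph. The paper names, damping factor, and iteration count are illustrative assumptions, and nothing here describes how Scholar actually ranks results.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Score papers by citation structure using power iteration.

    `citations` maps each paper to the list of papers it cites.
    """
    # Collect every paper that appears as a citer or as a cited work.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper receives the same baseline share each round.
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            cited = citations.get(p, [])
            if cited:
                # Split this paper's rank evenly among the works it cites.
                share = damping * rank[p] / len(cited)
                for c in cited:
                    new_rank[c] += share
            else:
                # A paper that cites nothing spreads its rank over all papers.
                share = damping * rank[p] / n
                for q in papers:
                    new_rank[q] += share
        rank = new_rank
    return rank


# Toy graph (made-up papers): A and B both cite C, D cites A, C cites nothing.
toy = {"A": ["C"], "B": ["C"], "C": [], "D": ["A"]}
scores = pagerank(toy)
for paper in sorted(scores, key=scores.get, reverse=True):
    print(paper, round(scores[paper], 3))
```

In this toy graph the most-cited paper, C, ends up with the highest score regardless of its text, which is the behavior a purely text-based relevance score would not reproduce.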
endisneigh about 4 years ago
I'm curious, how does the Internet Archive handle copyright with all of its services?
BugsJustFindMe about 4 years ago
I couldn't find a list of what sources (like which journals) they're archiving from. Does anyone know where to find that? It would be nice to see what subject categories the archive covers.