This points to challenges of digital information.<p>Stephen Hawking and his dissertation are high-profile as these things go. The NPR piece mentions other <i>popular</i> items generating hundreds of requests per month. I frequently run across items with <i>lifetime</i> request counts in the double or triple digits (and suspect I doubled the count on one particular item).<p>More often, though, the truth is that this material <i>simply isn't available online.</i> There are several thesis repositories (either Michigan State or University of Michigan is one, as I recall), and I can <i>frequently</i> turn up a shelf reference via WorldCat ... somewhere.<p>But there's work from surprisingly prominent names in numerous fields that simply isn't available in electronic format. The worst case is for materials from roughly 1924 - 1980: too late to be out of copyright, and too early to have been composed in, or converted to, digital formats (and 1980 is an early cut-off date for that, though it's when material seems to start appearing in bulk).<p>This includes PhD dissertations, Masters theses, and numerous academic or other writings, <i>often including government documents not under copyright.</i> Thankfully with Sci-Hub, actual published academic journal articles can be found, freely, with a very high success rate. Particularly painful for me are popular magazine and newspaper items, <i>for which even the indices are very frequently locked behind site-restricted or affiliate-only access.</i><p>The time-and-effort differential between being able to look something up online and travelling many miles to a facility for access is tremendous. 
And it absolutely stops a great many incidental queries dead.<p>See Rick Falkvinge's excellent rant about how the KRACK vulnerability was blocked behind corporate-only paywalls for over a decade:<p><a href="https://www.privateinternetaccess.com/blog/2017/10/the-recent-catastrophic-wi-fi-vulnerability-was-in-plain-sight-for-13-years-behind-a-corporate-paywall/" rel="nofollow">https://www.privateinternetaccess.com/blog/2017/10/the-recen...</a><p>Note that the issues here are twofold. One element is the task of scanning documents, making them available, and organising the results in a manner useful for search.<p>But much of the harm is the direct consequence of the present regime of copyright and paid access to information, AS WELL AS the perverse incentives of advertising-backed media and media manipulation, which together have created a media regime that is actively harmful to society.<p>I'd really like to see the elements of this addressed.