ISBNdb dump – how many books are preserved forever?

174 points by pilimi_anna over 2 years ago

10 comments

billblack over 2 years ago
As someone who would like to publish, my main concern with ISBNs is the cost, because publishers are required to assign an ISBN to every item in their catalog.

Section 6.1 of the ISBN International User Manual: "A separate ISBN shall be assigned to each separate monographic publication or separate edition or format of a monographic publication issued by a publisher."

This would not be a problem if the numbers were more affordable.

bloak over 2 years ago
Google has claimed that about 130 million books have been published (that factoid is all over the web). The number of 10-digit ISBNs is 1000 million (one of the ten digits is a check digit, leaving nine free) and people have only just started using 13-digit ISBNs that start with 979 instead of 978; but of course there must be lots of wasted ISBNs, for example when a publisher optimistically buys a big block and then goes bankrupt. Both those numbers suggest that the "ISBNdb" with fewer than 31 million ISBNs is far from complete.

The frequency of each top-level prefix (which tells you the geographical or language region) would be interesting. That would be the first thing I'd calculate if I had the data on my disc.

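A minimal sketch of the prefix tally suggested above, in Python. The function names are hypothetical, and the table of registration-group lengths for the 978 prefix is an assumption written from memory of the published ISBN ranges; it should be checked against the International ISBN Agency's current range file before being trusted. ISBNs starting with 979, and anything the table does not cover, are counted as "unknown".

```python
from collections import Counter
from typing import Iterable, Optional

# ASSUMPTION: registration-group lengths for the 978 prefix, keyed by the
# first five digits after "978". Verify against the official range file.
GROUP_LENGTH_RANGES_978 = [
    (0, 59999, 1),      # groups 0-5
    (60000, 64999, 3),  # groups 600-649
    (70000, 79999, 1),  # group 7
    (80000, 94999, 2),  # groups 80-94
    (95000, 98999, 3),  # groups 950-989
    (99000, 99899, 4),  # groups 9900-9989
    (99900, 99999, 5),  # groups 99900-99999
]


def registration_group(isbn13: str) -> Optional[str]:
    """Return the registration-group digits of a 978-prefixed ISBN-13, or None."""
    digits = isbn13.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return None
    if not digits.startswith("978"):
        return None  # 979 uses a different table, not covered in this sketch
    body = digits[3:]
    key = int(body[:5])
    for low, high, length in GROUP_LENGTH_RANGES_978:
        if low <= key <= high:
            return body[:length]
    return None


def group_frequencies(isbns: Iterable[str]) -> Counter:
    """Count how many ISBNs fall into each registration group."""
    counts: Counter = Counter()
    for isbn in isbns:
        group = registration_group(isbn)
        counts[group if group is not None else "unknown"] += 1
    return counts


if __name__ == "__main__":
    sample = ["9780306406157", "978-3-16-148410-0", "9789971502102"]
    for group, count in group_frequencies(sample).most_common():
        print(group, count)
```

Run over a full dump, `group_frequencies` would give the per-region counts; note that this sketch does not validate check digits.
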
Tomte over 2 years ago
ISBNs are supposed to be unique, but they aren't. Publishers reuse ISBNs: by mistake if they are trying to follow the rules, or sometimes intentionally.

It's not super common, but common enough that I ran across that problem when scanning in my bookshelf years ago.

Archelaos over 2 years ago
The problem with counting "books" is that the term is used in so many different ways that one might end up with estimates that differ by several orders of magnitude, depending on how narrow or wide a definition or characterization one adopts. How many books is a Bible? One, or around 80?[1] When there is a new minor edition, do we count zero, one or 80 new books? Some of these 80 "books" are only letters, less than a page or only a few pages long. Shall we count them all as "books"? If we do, should we then count each letter of a modern published correspondence as a single "book"? Poems were often published as very small booklets, but for prominent writers you may be able to purchase their "complete works" as the very same text in one thick volume or in a few handier volumes. How should we count this?

> Physical copies. Obviously this is not very helpful, since they’re just duplicates of the same material.

Alas, this is quite often not the case, in particular for older books, for various reasons: copies were bound from sheets of different print runs that used freshly assembled typesettings containing accidental or deliberate variations, sometimes sheets were missing or the order of pages is not correct, etc.[2] For important "books" we should therefore digitize every available copy.

As great as it would be to have 129,864,880 "books" scanned, this would be just an initial phase. We would need quality control: Is the resolution of the scans really always sufficient? Are the colours correctly represented (does every scan include a standard colour chart for comparison)? What about watermarks (they are extremely important for dating old books)? ...

Besides, I personally prefer to speak of "making books digitally available" rather than of "preserving" them, because many features of a physical copy are impossible to preserve digitally: chemical composition, (bio-)chemical traces, the DNA of parchment or animal bindings, their texture, how it feels to handle them, their visual appearance under different illuminations ...

[1] The number varies from denomination to denomination.

[2] And even renowned contemporary publishers sometimes silently correct errors without changing the numbering of the edition.

pugworthy over 2 years ago
Define "forever" in this context? 10 years? 100? 1000?

It's a legit question to answer.

photochemsyn over 2 years ago
There are some interesting technologies in the pipeline for truly long-term data storage. Synthetic diamond is one option (light-sensitive, so perhaps susceptible to cosmic-ray degradation over time):

https://theconversation.com/turning-diamonds-defects-into-long-term-3-d-data-storage-67685

Another is microetching, i.e. ion-beam insertion of foreign atoms into crystalline materials such as diamond or nickel. Although the data density is lower than with the above approach, it seems a lot less sensitive (i.e. light should have less effect):

https://en.wikipedia.org/wiki/HD-Rosetta

tedivm over 2 years ago
The timing on this is really interesting for me, as last week I got an ISBN issued for a book I'm working on (9781633438002 if anyone is curious!).

This will be the first book I'm the author of, but the second book I've worked on (I was the technical editor for the first). Neither of these books is out yet (I start writing tomorrow) but they both have ISBNs issued. Even if I never publish the book, that ISBN is locked in.

I imagine there are a lot of books that were started but never finished. That said, it looks like ISBNdb doesn't grab directly from the source, but instead crawls the internet looking for ISBN data to put into its database. I'll be interested to see at which stage my ISBN shows up in the database.

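For the curious, the ISBN quoted above can be checked against the standard ISBN-13 check-digit rule (weights 1 and 3 alternating over the first twelve digits, check digit chosen so the total is a multiple of 10). This is a small Python sketch; the function names are hypothetical.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first twelve digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Return True if the string is a 13-digit ISBN with a correct check digit."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


if __name__ == "__main__":
    # Weighted sum of the first twelve digits is 98, so the check digit is 2.
    print(is_valid_isbn13("9781633438002"))  # True
```

A passing check only shows the number is well formed; it says nothing about whether the ISBN has been registered or appears in any database.
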
omoikane over 2 years ago
That statement of "before the demise of Google Books" seems unnecessary. The next quoted bit of "at least until Sunday" might have been an attempt to complete the joke, but should be interpreted as the number of books changing rapidly according to the (12 year old) linked article.

http://booksearch.blogspot.com/2010/08/books-of-world-stand-up-and-be-counted.html

ZeroGravitas over 2 years ago
> extracting ISBNs from the actual book scans themselves (in the case of Z-Library/Libgen).

OpenLibrary also uses book scans in Archive.org to extract ISBNs (and a few other bits of metadata, like URLs in the text):

https://blog.openlibrary.org/2021/08/23/gsoc-2021-making-books-lendable/

And they have a software pipeline for that kind of thing available.

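As a rough illustration of what pulling ISBNs out of scanned text can involve, here is a minimal Python sketch. It is not OpenLibrary's pipeline, and nothing here reflects its actual code: the regular expression, the function names and the sample text are all assumptions. It finds 10- and 13-digit candidates in OCR-like text and keeps only those whose check digit is correct, which filters out most phone numbers and other digit runs.

```python
import re
from typing import List

# Candidate pattern: either 978/979 followed by ten more digits, or ten
# characters where the last may be X. Single hyphens or spaces may separate
# the digits. The lookarounds avoid matching inside longer digit runs.
ISBN_CANDIDATE = re.compile(
    r"(?<![0-9-])"
    r"(?:97[89](?:[- ]?[0-9]){10}"
    r"|[0-9](?:[- ]?[0-9]){8}[- ]?[0-9Xx])"
    r"(?![0-9])"
)


def _valid_isbn10(chars: str) -> bool:
    # Weights 10 down to 1; a trailing X stands for the value 10.
    total = 0
    for position, char in enumerate(chars):
        value = 10 if char == "X" else int(char)
        total += value * (10 - position)
    return total % 11 == 0


def _valid_isbn13(chars: str) -> bool:
    # Weights alternate 1, 3 across all thirteen digits.
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(chars))
    return total % 10 == 0


def extract_isbns(text: str) -> List[str]:
    """Return checksum-valid ISBNs found in text, normalized, in order, no repeats."""
    found: List[str] = []
    for match in ISBN_CANDIDATE.finditer(text):
        chars = re.sub(r"[- ]", "", match.group()).upper()
        valid = _valid_isbn13(chars) if len(chars) == 13 else _valid_isbn10(chars)
        if valid and chars not in found:
            found.append(chars)
    return found


if __name__ == "__main__":
    page = "ISBN 978-0-306-40615-7 (hardcover). ISBN-10: 0-306-40615-2. Call 555 123 4567."
    print(extract_isbns(page))
```

Real OCR output would also need handling for misread characters (for example the letter O for the digit 0), which this sketch leaves out.
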
mechanical_bear over 2 years ago
Forever? 0.