Great idea. Someone please build a public database of tables of contents and indexes, keyed by ISBN, so we can skip the scanning and OCR steps. And dear publishers: when this database emerges, please add to it as you publish new works.<p>Thanks.
Does anyone have any experience with book scanning in general? I've been eyeing a unit from "CZUR" but am a bit skeptical of the product. I would prefer to buy something more generic and higher-end hardware-wise, like a V-shaped scanner where you bring your own DSLRs, but I can't find out whether there is a serious open source software platform for them.
Offtopic: huydotnet, would you consider adding RSS to your blog? I'd like to subscribe, but couldn't find a link anywhere, including the source code of the page.<p>It seems like you're generating pages from Org mode. I've recently discovered ox-hugo, maybe it'll be of interest to you too. I wrote about my setup here <a href="https://rakhim.org/2018/09/moved-from-jekyll-to-hugo-and-ox-hugo/" rel="nofollow">https://rakhim.org/2018/09/moved-from-jekyll-to-hugo-and-ox-...</a>
CamScanner[1] on Android does a very nice job of this kind of work. I'm not associated with the product, just a longtime satisfied user.<p>[1] <a href="https://www.camscanner.com/" rel="nofollow">https://www.camscanner.com/</a>
I find it cool that this blog post was originally sketched by hand: <a href="https://huytd.github.io/img/handwritten-build-a-better-bookshelf.jpg" rel="nofollow">https://huytd.github.io/img/handwritten-build-a-better-books...</a>
Are there multiple versions of the current OneNote? I can't OCR anything in my version for Windows 10. I'm stuck like this guy - <a href="https://answers.microsoft.com/en-us/msoffice/forum/msoffice_onenote-mso_win10-msoversion_other/onenote-ocr-not-working/e239c9ea-e1b3-46c3-a976-45a418231dbc" rel="nofollow">https://answers.microsoft.com/en-us/msoffice/forum/msoffice_...</a>
This reminds me of Bret Victor’s Bookcase, which displays the sections “highlighted” by projecting them on a wall and navigating to that page on an iPad.
The fact that we have to scan any technical book published after 1980 is a distortion of capitalism. The obviously most efficient solution for everyone would be to have the darn fully searchable digital version of the book plus the source code.
I’d rather scan the whole book and replace my library with an iPad. In fact, I’ve already done that. With over 1,000 books on my iPad it’s the only reason I’m still married.