TechEcho
Ask HN: Recommended OCR?

9 points by shiny over 14 years ago

I'm OCRing a bunch of TIFF files with tesseract, and while it works to some degree, it's nowhere near as accurate as I'd like it to be. Perhaps I'm doing something wrong and I could tune it to my liking, but I can't find many resources on tesseract. Am I missing something?

Any other recommendations for OCR tools? Ideally it would be free, but I'm willing to pay if it's not too pricey.

I've been trying out the trial version of FineReader, and it seems to work pretty well, so I may go with that.

Any help is greatly appreciated.

5 comments

MaxGabriel over 14 years ago
I've had really great success with FineReader. I tried out every free OCR tool I could find, and after poor results I went for FineReader.

Spend some time on their website so you get the right product; they also list multiple prices for the same products. I got the latest FineReader (after a coupon code I found on Google) for between 130 and 150.

(I'm mostly scanning books.)
ig1 over 14 years ago
FineReader is what Project Gutenberg has been using for the last decade or so.
hebz0rl over 14 years ago
What about GOCR? It's open source; see http://jocr.sourceforge.net/
usermac over 14 years ago
Fujitsu ScanSnap
hijimayor over 14 years ago
One thing that improves Tesseract's performance dramatically is giving it grayscale TIFF images. Run

    mogrify -type Grayscale *.tif

and then run the results through tesseract to see the difference. No idea why no one mentions this in the documentation.
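
For anyone who wants to script the grayscale tip above, here is a minimal Python sketch. It assumes the `mogrify` (ImageMagick) and `tesseract` binaries are on your PATH; the function names and the directory names in the usage note are hypothetical, chosen only for illustration. Because `mogrify` overwrites files in place, the sketch works on copies so the original scans stay untouched.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(tiff: Path, workdir: Path) -> list[list[str]]:
    """Return the mogrify and tesseract commands for one TIFF file."""
    gray = workdir / tiff.name      # working copy that mogrify will overwrite
    out_base = workdir / tiff.stem  # tesseract appends .txt to this base name
    return [
        ["mogrify", "-type", "Grayscale", str(gray)],
        ["tesseract", str(gray), str(out_base)],
    ]


def ocr_directory(src: Path, workdir: Path) -> None:
    """Copy each TIFF into workdir, convert it to grayscale, then OCR it."""
    workdir.mkdir(parents=True, exist_ok=True)
    for tiff in sorted(src.glob("*.tif")):
        shutil.copy(tiff, workdir / tiff.name)  # keep the original untouched
        for cmd in build_commands(tiff, workdir):
            subprocess.run(cmd, check=True)  # raises if a tool exits non-zero
```

Calling `ocr_directory(Path("scans"), Path("scans_gray"))` would leave one grayscale TIFF and one `.txt` file per page in `scans_gray`. Running tesseract on the untouched originals as well makes it easy to compare the two outputs and see whether the conversion helps on your material.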