Fine-tuning OCR works really well: Statistical Abstracts of the United States

2 points by c_moscardi 8 months ago

1 comment

c_moscardi 8 months ago
Hi HN! I've spent a couple of months fiddling with OCR and wanted to share some of my findings.

The approach I share here (fine-tuning recent deep learning models) is the first one that's gotten me anything resembling high-quality OCR on these particular noisy historical documents. OCRing these has been something of a white whale for me for several years (except, a white whale that I have spent comparatively little time on).

At this point I think I am reasonably competent in OCR, but no expert... Curious for your thoughts.