Show HN: OCR Workbench: AI OCR for hard documents

2 points by viking2917 11 days ago
OCR on old documents is hard. OCR Workbench uses AI for OCR and provides an editing environment to clean things up, as is inevitably required.

Inspired by this Hacker News post: https://news.ycombinator.com/item?id=43048698

Backstory: I was having trouble producing transcriptions of Colonial American documents, which have their own unique challenges for OCR, and tools like Tesseract fail miserably on them. So I built something. It uses Gemini and seems to work pretty well (disclaimer: you need your own API key). I didn't build Claude support, but I expect it would work similarly well.

FWIW: largely vibe coded, with human review and intervention as required.

1 comment

keepsweet 10 days ago
Interesting concept. I tried it with a text written in Church Slavonic, didn't work. I guess the documents don't have to be THAT old. It would also be nice if you could upload images individually instead of selecting everything from a folder. Either way, nice work.