Show HN: OCR Workbench: AI OCR for hard documents

2 points by viking2917, 1 day ago

OCR on old documents is hard. OCR Workbench uses AI for OCR and provides an editing environment for the cleanup that is inevitably required.

Inspired by this Hacker News post: https://news.ycombinator.com/item?id=43048698

Backstory: I was having trouble producing transcriptions of Colonial American documents, which pose their own unique challenges for OCR, and tools like Tesseract fail miserably on them. So I built something. It uses Gemini and seems to work pretty well (disclaimer: you need your own API key). I didn't build Claude support, but I expect it would work similarly well.

FWIW: largely vibe coded, with human review and intervention as required.

1 comment

keepsweet, about 14 hours ago

Interesting concept. I tried it with a text written in Church Slavonic, and it didn't work. I guess the documents don't have to be THAT old. It would also be nice if you could upload images individually instead of selecting everything from a folder. Either way, nice work.