Show HN: Extract text from any pdf in the browser

12 points by blaydator about 5 years ago

1 comment

blaydator about 5 years ago
Hi Hackers,

I often get PDFs that I want to extract text from and paste somewhere else. Not all PDFs are well constructed, and a lot of them are scanned. Unfortunately, Mac's Preview and other classic PDF viewers cannot extract text from those.

So I have built a minimalist website to extract text from any PDF, including scanned ones. It uses OCR to extract the text, and the user can highlight specific areas of the document to extract from. The extraction is done locally in the browser thanks to the awesome Tesseract.js library.

I would love to have your feedback before adding more features (zoom setting, improved area selection, png/jpeg support, mobile support, offline support, ...).
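For readers curious how area-based OCR like this might be wired up, here is a minimal sketch, not the site's actual code. It assumes the page has already been rendered to an image, that the user's highlights are recorded in on-screen (CSS) pixels, and that the OCR worker accepts a `rectangle` option the way Tesseract.js's `worker.recognize` does. The names `Rect`, `toImageRect`, `OcrWorker` and `extractRegions` are hypothetical.

```typescript
// Hypothetical types and names for illustration only.
interface Rect { left: number; top: number; width: number; height: number; }
interface Size { width: number; height: number; }

// Convert a highlight drawn on the displayed page (CSS pixels) into pixel
// coordinates of the rendered page image, clamped to the image bounds.
function toImageRect(sel: Rect, displayed: Size, image: Size): Rect {
  const sx = image.width / displayed.width;
  const sy = image.height / displayed.height;
  const left = Math.max(0, Math.round(sel.left * sx));
  const top = Math.max(0, Math.round(sel.top * sy));
  const right = Math.min(image.width, Math.round((sel.left + sel.width) * sx));
  const bottom = Math.min(image.height, Math.round((sel.top + sel.height) * sy));
  return {
    left,
    top,
    width: Math.max(0, right - left),
    height: Math.max(0, bottom - top),
  };
}

// Minimal shape of the OCR worker this sketch relies on. It is assumed to
// match Tesseract.js's worker.recognize(image, { rectangle }) call.
interface OcrWorker {
  recognize(
    image: unknown,
    options?: { rectangle: Rect }
  ): Promise<{ data: { text: string } }>;
}

// Run OCR once per highlighted region and collect the trimmed text.
async function extractRegions(
  worker: OcrWorker,
  image: unknown,
  regions: Rect[]
): Promise<string[]> {
  const texts: string[] = [];
  for (const rectangle of regions) {
    const { data } = await worker.recognize(image, { rectangle });
    texts.push(data.text.trim());
  }
  return texts;
}
```

Passing the worker in as a parameter keeps the coordinate logic testable without loading an OCR engine. In a browser, the worker would presumably come from Tesseract.js's `createWorker`.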