Ask HN: What's the most performant/practical model/API for text extraction?

1 point by sandkoan, over 2 years ago

AWS Textract might be more accurate, for instance, but it is nowhere near as cost effective as spinning up an EC2 instance running Tesseract, easyOCR, or PaddleOCR.

Or does it make more sense, from an accuracy-vs-cost standpoint, to just run a transformer model like TrOCR after locating the bounding boxes that contain text with something like CRAFT or EAST?
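To make the cost side of that trade-off concrete, here is a minimal sketch of a per-page cost comparison. Everything in it is an assumption for illustration: the class and function names are hypothetical, the model assumes the instance is kept fully busy, and it ignores accuracy, engineering time, storage, and data transfer. Plug in your own measured throughput and the current prices from your provider rather than trusting any placeholder figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfHostedOCR:
    """Hypothetical cost model for an OCR engine running on a rented instance."""

    hourly_instance_cost: float  # USD per hour; take this from your provider's price list
    pages_per_hour: float        # throughput measured on your own documents

    def cost_per_page(self) -> float:
        # Assumes the instance is fully utilized for the whole hour.
        if self.pages_per_hour <= 0:
            raise ValueError("pages_per_hour must be positive")
        return self.hourly_instance_cost / self.pages_per_hour


def break_even_pages_per_hour(hourly_instance_cost: float,
                              api_price_per_page: float) -> float:
    """Throughput above which self-hosting costs less per page than the API."""
    if api_price_per_page <= 0:
        raise ValueError("api_price_per_page must be positive")
    return hourly_instance_cost / api_price_per_page


def cheaper_option(self_hosted: SelfHostedOCR, api_price_per_page: float) -> str:
    """Compare per-page cost only; accuracy is deliberately left out."""
    if self_hosted.cost_per_page() < api_price_per_page:
        return "self-hosted"
    return "managed API"


if __name__ == "__main__":
    # Placeholder numbers, not real quotes from any vendor.
    candidate = SelfHostedOCR(hourly_instance_cost=0.10, pages_per_hour=1000)
    print(cheaper_option(candidate, api_price_per_page=0.0015))
    print(break_even_pages_per_hour(0.10, 0.0015))
```

A comparison like this only settles the cost axis. The accuracy axis still has to be measured on a labeled sample of your own documents, since a cheaper engine that needs manual correction can end up costing more overall.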

1 comment

sargstuff, over 2 years ago

Depends on different factors & criteria.

Short example list:

* fixed-font character text on a blank background; human handwriting set against a busy city-street background

* converting a non-text-font image to a text description (a collage of images forming the illusion of a text font)