Ask HN: LLM Enhanced OCR

5 points by dayeye2006, about 2 years ago
Has anyone experimented with using an LLM to enhance OCR results? OCR software may produce output that is full of noise (nonsense characters). It's very hard to pattern-match the generated results, since the noise is highly unpredictable. Could an LLM help "de-noise" the results, given that LLMs tend to take in character-level information and might recognize which parts are useless?

3 comments

Agraillo, about 2 years ago
I'm sure the LLM-based engines will shine here; in part they are already here. A couple of observations:

- Google Lens, now activated by default when you post an image to Google Images (https://images.google.com), has a text recognition feature, and it is very impressive even if you give it an image at screen DPI with grammatically incoherent text (dictionary entries with short phrases and abbreviations).
- I played with different LLM-based chats using the following query: "Please reconstruct the original text from the following corrupted one: Smng rng wt ths ly". The test is similar to an OCR task where not all letters are recognizable or printed clearly. Perplexity, for example, answered with hesitation but mostly correctly (something like: "I can not answer definitely, but related is 'Something wrong with this reply'").
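The corrupted-text test described in the comment above can be reproduced with a small helper. This is only a minimal sketch: the function name `drop_characters` and its parameters are hypothetical, and it assumes the corruption is a random deletion of non-space characters, which is just one simple way to imitate unreadable letters in OCR output.

```python
import random


def drop_characters(text: str, drop_prob: float, seed: int = 0) -> str:
    """Randomly delete non-space characters to imitate unreadable letters.

    Hypothetical helper for illustration; real OCR noise also includes
    substitutions and inserted junk, which this sketch does not model.
    """
    rng = random.Random(seed)  # seeded so a test case can be regenerated
    kept = []
    for ch in text:
        # Always keep whitespace so word boundaries survive.
        if ch.isspace() or rng.random() >= drop_prob:
            kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    original = "Something wrong with this reply"
    corrupted = drop_characters(original, drop_prob=0.5, seed=1)
    query = (
        "Please reconstruct the original text from the following "
        "corrupted one: " + corrupted
    )
    print(query)
```

Because the original sentence is known, the chat's answer can be compared against it to judge how well the reconstruction worked.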
phren0logy, about 2 years ago
I have been wondering the same thing. So many OCR engines spit out results that are obviously wrong, and I don't want them to get too clever, but a little bit of smarts would go a long way.
speedgoose, about 2 years ago
You should try. Tell ChatGPT to fix the following text that has OCR mistakes and it should work.
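The suggestion above amounts to wrapping the OCR output in a correction prompt. A minimal sketch of that idea follows, under loud assumptions: the prompt wording, the names `build_prompt` and `fix_ocr_text`, and the `ask_llm` callable are all hypothetical. `ask_llm` stands in for whatever chat client you use (it takes a prompt string and returns the reply string); no specific vendor API is assumed here.

```python
from typing import Callable

# Hypothetical prompt wording; adjust to taste.
PROMPT_TEMPLATE = (
    "Fix the following text that has OCR mistakes. "
    "Return only the corrected text and do not add or remove content.\n\n"
    "Text:\n{ocr_text}"
)


def build_prompt(ocr_text: str) -> str:
    """Embed the raw OCR output in the correction prompt."""
    return PROMPT_TEMPLATE.format(ocr_text=ocr_text.strip())


def fix_ocr_text(ocr_text: str, ask_llm: Callable[[str], str]) -> str:
    """Send the prompt through a caller-supplied LLM function.

    ask_llm is an assumption: any function mapping a prompt string
    to the model's reply string.
    """
    reply = ask_llm(build_prompt(ocr_text))
    return reply.strip()


if __name__ == "__main__":
    # Stand-in for a real model call, used only to show the data flow.
    def fake_llm(prompt: str) -> str:
        return "Something wrong with this reply"

    print(fix_ocr_text("Smng rng wt ths ly", fake_llm))
```

Asking the model to return only the corrected text is one way to address the worry, raised earlier in the thread, about the tool getting too clever; it does not guarantee the model will not rewrite content, so spot-checking against the source image is still advisable.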