Ask HN: OCR from screenshot returns gibberish

3 points by raleigh_user, over 6 years ago
My brother works at a restaurant and his manager sends him screenshots of the schedule via email.

I would like to write a simple OCR app that does the following:

- gets the screenshot from his Gmail
- finds his name on the schedule
- adds the hours from the schedule to his Google Calendar

This is a fun weekend project. Thinking about building it in a new language I haven't used before.

However, when running the screenshot through the OCR stuff I can find online (before actually writing code) the results are absolutely horrible.

Am I doing this wrong or is OCR just not very good?
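The "find his name, then read the hours" step can be sketched independently of whichever OCR engine ends up producing the text. The snippet below is only an illustration: the row layout (a name followed by time ranges such as `9:00-17:00`, or `OFF`), the function name `find_shifts`, and the sample names are all assumptions, since the thread never shows the actual schedule.

```python
import re

# Hypothetical cell format: a time range such as "9:00-17:00".
# Anything else on the row (for example "OFF") is ignored.
SHIFT = re.compile(r"(\d{1,2}):(\d{2})\s*-\s*(\d{1,2}):(\d{2})")

def find_shifts(ocr_text, name):
    """Return (start, end) pairs, in minutes since midnight, for every
    time range on the first line that contains `name` (case-insensitive)."""
    for line in ocr_text.splitlines():
        if name.lower() in line.lower():
            return [
                (int(h1) * 60 + int(m1), int(h2) * 60 + int(m2))
                for h1, m1, h2, m2 in SHIFT.findall(line)
            ]
    return []  # name not found in the OCR output
```

Working in minutes since midnight keeps the parsing step separate from any calendar API; turning the pairs into calendar events would be a later step.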

3 comments

janci, over 6 years ago
You may need to prepare the input: isolate the parts you want read, blank out everything else, and remove the background color, table borders, and graphic elements. Convert to greyscale/BW, then apply OCR.
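A minimal sketch of that kind of preparation, assuming the screenshot has already been decoded into rows of `(r, g, b)` tuples. The function names, the default threshold of 128, and the luminance weights (the common ITU-R BT.601 ones) are choices made for this illustration and do not come from the comment. In practice an imaging library would do the same steps much faster.

```python
def crop(pixels, top, left, bottom, right):
    """Keep only rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in pixels[top:bottom]]

def to_grayscale(pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in pixels
    ]

def binarize(gray, threshold=128):
    """Map each luminance value to 0 (black) or 255 (white).

    Light cell backgrounds end up white, dark text ends up black.
    """
    return [[255 if v >= threshold else 0 for v in row] for row in gray]
```

A fixed threshold is the simplest option. It tends to work on screenshots, which have even lighting, but would need tuning for dark-themed or low-contrast schedules.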
ColinWright, over 6 years ago
I've generally found that OCR requires high resolution and/or image pre-filtering. With significant pre-filtering I've had some great results.

Tesseract can be very, very good, but also very, very bad. I'd suggest you have a quick hack at writing your own overly simplistic OCR tool and see how well you get on. This will either give you an appreciation of the difficulties and potentially how to do the pre-processing to overcome them, or you will have a tool that is better than the existing ones, and people will love you for it.
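For a sense of what an "overly simplistic OCR tool" might look like, here is a toy template matcher. It is a sketch under heavy assumptions: the 3x5 glyph grids and the names `TEMPLATES`, `score`, and `classify` are invented for this example, the input must already be segmented into one binarized glyph per call, and this is not how Tesseract works.

```python
# Hypothetical 3x5 glyph templates ('#' = ink, '.' = background).
# Real screenshots would need much larger grids and many more characters.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}

def score(glyph, template):
    """Count the cells where the glyph and the template agree."""
    return sum(
        g == t
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    )

def classify(glyph):
    """Return the template character that agrees with the glyph most."""
    return max(TEMPLATES, key=lambda ch: score(glyph, TEMPLATES[ch]))
```

Even this toy shows why pre-filtering matters: every stray background or table-border pixel flips cells and lowers the score of the correct template.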
ohiovr, over 6 years ago
I'm not an expert, but have you tried Tesseract OCR?