Ask HN: OCR from screenshot returns gibberish

3 points by raleigh_user over 6 years ago
My brother works at a restaurant and his manager sends him screenshots of the schedule via email.

I would like to write a simple OCR app that does the following:

- gets the screenshot from his Gmail
- finds his name on the schedule
- adds the hours from the schedule to his Google Calendar

This is a fun weekend project. Thinking about building it in a new language I haven't used before.

However, when running the screenshot through the OCR stuff I can find online (before actually writing code), the results are absolutely horrible.

Am I doing this wrong or is OCR just not very good?
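For the "find his name on the schedule" step, here is a minimal sketch of what the lookup could look like once some OCR text is available. It assumes a layout the post does not describe: one row per employee, with the name as the first word followed by time ranges such as `9:00-17:00` or `9am-5pm`. The function name `find_shifts`, the `cutoff` value, and the sample text are all made up for illustration. Fuzzy matching is used because OCR output often garbles a character or two in a name.

```python
import difflib
import re

# Matches ranges such as "9:00-17:00", "9am-5pm" or "11:30 am - 4 pm".
# The schedule format is an assumption, not something taken from the post.
TIME_RANGE = re.compile(
    r"(\d{1,2}(?::\d{2})?\s*(?:am|pm)?)\s*-\s*(\d{1,2}(?::\d{2})?\s*(?:am|pm)?)",
    re.IGNORECASE,
)


def find_shifts(ocr_text, name, cutoff=0.7):
    """Return the time ranges on the first row whose leading word resembles `name`."""
    for line in ocr_text.splitlines():
        words = line.split()
        if not words:
            continue
        # Fuzzy comparison tolerates OCR slips such as "A1ex" for "Alex".
        if difflib.get_close_matches(name.lower(), [words[0].lower()], n=1, cutoff=cutoff):
            return [(start.strip(), end.strip()) for start, end in TIME_RANGE.findall(line)]
    return []


# Hypothetical OCR output with a misread character in the name.
sample = "Name Mon Tue\nA1ex 9:00-17:00 off\nSam 11am-4pm 9am-5pm\n"
print(find_shifts(sample, "Alex"))  # prints [('9:00', '17:00')]
```

The returned pairs are still plain strings; turning them into calendar events would need the dates from the schedule header, which this sketch does not attempt.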

3 comments

janci over 6 years ago
You may need to prepare the input: isolate the parts you want read, blank out everything else, and remove the background color, table borders, and graphic elements. Convert to greyscale/BW. Then apply OCR.
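The preparation steps above might be sketched as follows with Pillow. This is one possible approach under stated assumptions, not a recipe from the thread: the function name `preprocess`, the 3x upscale, and the threshold of 160 are illustrative guesses that would need tuning against a real screenshot. It also does not remove table borders, which would need extra work.

```python
from PIL import Image, ImageOps


def preprocess(img, crop_box=None, scale=3, threshold=160):
    """Crop, greyscale, upscale and binarise a screenshot before OCR.

    crop_box is an optional (left, upper, right, lower) tuple; scale and
    threshold are illustrative defaults, not recommended values.
    """
    if crop_box is not None:
        img = img.crop(crop_box)  # keep only the region to be read
    # Drop colour information, then stretch the contrast to the full range.
    grey = ImageOps.autocontrast(img.convert("L"))
    # Screenshots are low resolution; enlarging gives the OCR engine more pixels.
    big = grey.resize((grey.width * scale, grey.height * scale), Image.LANCZOS)
    # Pixels brighter than the threshold become white, the rest black.
    return big.point(lambda p: 255 if p > threshold else 0)


# Synthetic stand-in for a screenshot: dark "ink" on a pale yellow background.
demo = Image.new("RGB", (20, 10), (255, 255, 200))
demo.paste((30, 30, 30), (6, 3, 12, 7))
result = preprocess(demo)
print(result.mode, result.size)  # prints L (60, 30)
```

With a real file, something like `preprocess(Image.open("schedule.png"))` would produce the image to hand to the OCR engine; the file name here is hypothetical.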
ColinWright over 6 years ago
I've generally found that OCR requires high resolution and/or image pre-filtering. With significant pre-filtering I've had some great results.

Tesseract can be very, very good, but also very, very bad. I'd suggest you have a quick hack at writing your own overly simplistic OCR tool and see how well you get on. This will either give you an appreciation of the difficulties and potentially how to do the pre-processing to overcome them, or you will have a tool that is better than the existing ones, and people will love you for it.
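To give a feel for what an "overly simplistic OCR tool" could look like, here is a toy sketch based on template matching: each glyph is compared cell by cell against a few stored bitmaps and the closest one wins. Everything in it is an assumption made for illustration, including the 3x5 templates, the fixed character pitch, and the function names. It only works on already binarised, perfectly aligned input, which is exactly the kind of limitation that shows why pre-processing matters.

```python
# Hypothetical 3x5 glyph templates; '#' is ink, '.' is background.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def distance(a, b):
    """Count the cells in which two equally sized glyphs differ."""
    return sum(ca != cb for row_a, row_b in zip(a, b) for ca, cb in zip(row_a, row_b))


def classify(glyph):
    """Return the template character closest to `glyph`."""
    return min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))


def read_line(rows, width=3, gap=1):
    """Split a fixed-pitch bitmap into glyphs and classify each one."""
    step = width + gap
    count = (len(rows[0]) + gap) // step
    return "".join(
        classify([row[i * step : i * step + width] for row in rows])
        for i in range(count)
    )


# "170" with one stray ink pixel in the middle glyph.
noisy = [
    ".#..###.###",
    "##....#.#.#",
    ".#...#..#.#",
    ".#..##..#.#",
    "###..#..###",
]
print(read_line(noisy))  # prints 170
```

The nearest-template rule absorbs the single stray pixel here, but it would fall apart as soon as glyphs vary in size, spacing, or font.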
ohiovr over 6 years ago
I'm not an expert, but have you tried Tesseract OCR?