Show HN: Tesseract.js – Pure JavaScript OCR for 60 Languages

727 points by bijection, over 8 years ago

28 comments

xigency, over 8 years ago
To anyone screen capturing small fonts as a demonstration, or capturing digital text especially at a small resolution, I don't believe that that is the purpose of this OCR library. (As a specialized problem, that might be easier to solve depending on the typeface.)

A much better example that works quite well is a picture of someone holding a book: http://i.imgur.com/3JWs64x.jpg

    Magic . Read this to yourself. Read it silently Don't move your lips. Don’t make a suund Listen to yourself. Listen without hearing What a wonderfully weird thing, huh? NOW MAKE THIS PART LOUD! SCREAM IT IN YOUR MIND! DROWN EVERYTHING OUT. Now, hear a whisper. A tiny whisper. New, read this next line with your best crotchety— old-man voice: “Hello there, sonny. Does your town have apost 0 Awesome! Who was that? Whose voice was that? It sure wasn’t yours! How do you do that? How?! Must be magic.

Problems with this text: misspelled 'sound' as 'suund', didn't recognize the word 'anything', and mis-recognized 'a post office' as 'apost 0'.

Not bad. Especially since two of three mistakes are on the edge of the page.

pyronite, over 8 years ago
The text detection is lacking in comparison to Google's Vision API. Here is a real-life comparison between Tesseract and Google's Vision API, based on a PDF a user of our website uploaded.

Original text [http://i.imgur.com/CZGhKhn.png]:

> I am also a top professional on Thumbtack which is a site for people looking for professional services like on gig salad. Please see my reviews from my clients there as well

Google detects [http://i.imgur.com/pSJym1x.png]:

> “ I am also a top professional on Thumbtack which is a site for people looking for professional services like on gig salad. Please see my reviews from my clients there as well ”

Tesseract detects [http://i.imgur.com/wwbLU6g.png]:

> \ am also a mp pmfesslonzl on Thummack wmcn Is a sue 1m peop‘e \ookmg (or professmna‘ semces We on glg salad P‘ezse see my rewews 1mm my cuems were as weH

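One way to put a number on side-by-side comparisons like the one above is a character error rate: the edit distance between the reference text and the OCR output, divided by the reference length. The following is a minimal sketch; `levenshtein` and `characterErrorRate` are illustrative names and are not part of Tesseract.js or the Vision API.

```javascript
// Levenshtein edit distance between two strings, using two rolling rows.
// Compares UTF-16 code units, which is adequate for plain English text.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Character error rate: edits needed per reference character.
// Assumes `reference` is non-empty.
function characterErrorRate(reference, hypothesis) {
  return levenshtein(reference, hypothesis) / reference.length;
}
```

For instance, `characterErrorRate("sound", "suund")` is 0.2: one substitution over five characters. Running it over the full original and detected passages would give a single figure per engine.
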
iplaw, over 8 years ago
HOW is there not a better, almost 100% accurate OCR tool?

I routinely (daily) need to OCR PDF files. The PDF files are not scans. They are PDF files created from a Word file. The text is 100% clear, the lines are 100% straight, and the type is 100% uniform.

And yet, Microsoft and Google OCR spit out gibberish that is full of critical errors.

From a problem-solving perspective, this seems like an incredibly easy problem to solve in this exact use case. That is, PDFs generated from text files. Identify a uniform font size (prevent o-to-O and o-to-0 errors), identify a font-family (serif, sans-serif, narrow to particular fonts), and OCR the damn thing. And yet, the output is useless in my field.

jameslk, over 8 years ago
For all those claiming issues with reading text from a screenshot of this page, note that this is more an issue with the original Tesseract library, not this library (which appears to wrap Tesseract compiled through Emscripten). I remember having a similar issue when I used the original Tesseract. The quick hack I found to fix it was to rescale any small text input images 3x first before feeding them to Tesseract. I'm sure there are more intelligent solutions to mitigate that problem.

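A minimal sketch of the rescaling workaround described above, assuming a browser environment with an already loaded image element. The helper names `scaledSize` and `upscaleForOcr` are hypothetical, and passing the resulting canvas to `Tesseract.recognize` assumes the library accepts canvas elements as input.

```javascript
// Compute the target size for an image scaled by `factor`.
function scaledSize(width, height, factor) {
  return {
    width: Math.round(width * factor),
    height: Math.round(height * factor),
  };
}

// Browser-only: draw `image` onto a larger canvas and return that canvas.
// The default factor of 3 follows the "3x" figure from the comment.
function upscaleForOcr(image, factor = 3) {
  const size = scaledSize(
    image.naturalWidth || image.width,
    image.naturalHeight || image.height,
    factor
  );
  const canvas = document.createElement("canvas");
  canvas.width = size.width;
  canvas.height = size.height;
  const ctx = canvas.getContext("2d");
  // Stretch the source image to fill the enlarged canvas.
  ctx.drawImage(image, 0, 0, size.width, size.height);
  return canvas;
}
```

Usage would look like `Tesseract.recognize(upscaleForOcr(myImage))`. The thread reports 3x only as a quick hack, so the right factor for a given input is a matter of experiment.
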
AgentME, over 8 years ago
Why the promise-*like* interface? If it returned a promise with a this-returning progress method monkey-patched onto it, then you could use it otherwise like a regular promise:

    Tesseract.recognize(myImage)
      .progress(function(message){console.log(message)})
      .then(function(result){console.log(result)})
      .catch(function(err){console.error(err)});

or

    Tesseract.recognize(myImage)
      .progress(function(message){console.log(message)})
      .then(
        function(result){console.log(result)},
        function(err){console.error(err)}
      );

I guess I just still have bad memories of jQuery's old almost-like-real promises. I'd rather never have to think ever again about whether I'm dealing with a real promise or one that's going to surprise me and break at run-time because I tried to use it like a real one.

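The suggestion above, a real promise with a chainable `progress` method attached, could be sketched as follows. `recognizeWithProgress` and its `work` callback are hypothetical stand-ins for a long-running job, not the Tesseract.js API.

```javascript
// Returns a genuine Promise that also carries a chainable .progress() method.
// `work` receives a `notify` function and returns the final result
// (or a promise for it).
function recognizeWithProgress(work) {
  const listeners = [];
  const notify = (message) => listeners.forEach((fn) => fn(message));

  const promise = new Promise((resolve, reject) => {
    // Defer the work so callers can attach .progress() handlers first.
    setTimeout(() => {
      try {
        resolve(work(notify));
      } catch (err) {
        reject(err);
      }
    }, 0);
  });

  // The monkey-patched method: registers a listener and returns the
  // same promise so that .then() and .catch() can follow.
  promise.progress = function (fn) {
    listeners.push(fn);
    return this;
  };

  return promise;
}
```

One caveat with this design: `.then()` and `.catch()` return new promises that do not carry the patched method, so `.progress()` has to come first in the chain, as it does in both examples in the comment.
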
greenpizza13, over 8 years ago
Excited about this... but the OCR quality seems to be very bad. Maybe it's not optimized for recognizing black text on a white background.

For example, I took a screenshot of this comment and ran it through the demo and got this:

Excited ehent this... but the OCR enenty Seems te be very bad. Maybe it's het Dptimized far recngnizing black text an e white heckgmnhe. EDI example, 1 tank e Screenshnt at this cement ehe teh it. thmneh the den» ehd get this:

It seems to recognize the bounding boxes just fine but mangles the words.

goatslacker, over 8 years ago
I've been using this library to read screenshots of Pokemon Go to automatically calculate Individual Values for each Pokemon [1]. It's worked great on desktop, but on mobile Safari, where it matters most, the library causes the browser to crash :(

1: https://github.com/goatslacker/pokemon-go-iv-calculator/blob/master/web/components/PictureUpload.js

userbinator, over 8 years ago
Tesseract was one of the best publicly-available CAPTCHA solvers when I was playing around with that stuff a few years ago; I remember somewhere in the neighbourhood of 90%+ accuracy on ReCAPTCHA. No wonder they've changed those considerably since then to make it difficult even for humans.

gentleteblor, over 8 years ago
I've always wanted to use Tesseract on .NET projects but it was always clumsy (wrappers). This looks like it'll make things easier.

Thanks for putting this out.

yankyou, over 8 years ago
> Drop an English image on this page to OCR it!

This looks great, and I'd really love to but

> Uncaught ReferenceError: progress is not defined

EDIT: works now!

mdani, over 8 years ago
Languages list link is broken - getting 404 for the following: https://github.com/naptha/tesseract.js/blob/master/tesseract_lang_list.md

zelon88, over 8 years ago
Does this mean I can implement Tesseract on my home server without using PHP's shell_exec to perform magic on my files? I can just use JavaScript instead? Cool!

My current HRCloud2 project could benefit greatly if I ever get around to it. Currently I make the PHP interpreter jump through hoops and move stuff all over the place to OCR images and docs. This could save a ton of time and shift the processing to the client instead of my server.

KiwiCoder, over 8 years ago
Impressive that this is pure JS; however, trying an image cut from the page itself gave this result:

> Dropan Enghsh Wage on (Ms page to OCR m

Should be:

> Drop an English image on this page to OCR it!

daliwali, over 8 years ago
The title and description are very misleading: this is technically "pure JavaScript" but the JS is compiled from the original C++ library of the same name using Emscripten. I think "pure JS" would imply that all of its sources are written in JS, which is not the case here. It's mostly the C++ code doing the actual work, with a little JS wrapper on top.

slajax, over 8 years ago
Pretty cool. I screen captured the text in the bottom right corner of the page and it had some issues. Here's a screenshot of what happened: http://io.kc.io/hkeM

mgalka, over 8 years ago
Awesome! The ability to OCR video in a browser opens up so many interesting possibilities.

jaytaylor, over 8 years ago
For those who may be interested:

I threw together a quick proof-of-concept in Go for exposing Tesseract via a web API:

https://github.com/jaytaylor/tesseract-web

zhte415, over 8 years ago
Does this include taking a text and, for example, when viewing it, 'wiping' the text in the logical native language order?

For languages that don't employ much whitespace, this would be nice.

artf, over 8 years ago
Sorry guys, probably a stupid question (googled quickly, didn't work), but does this kind of stuff involve ML? Do I need to train it?

maaaats, over 8 years ago
Does it block while it works and do the work in several setTimeouts, or how do they get it to report progress without freezing everything?

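The `setTimeout` approach the question mentions can be sketched as below: process a slice of the work, report progress, then yield to the event loop before the next slice. This is a generic illustration with a hypothetical `processInChunks` helper. The reply in the thread is not loaded, so nothing here should be read as how Tesseract.js itself reports progress.

```javascript
// Process `items` in slices of `chunkSize`, yielding to the event loop
// between slices so the page stays responsive.
// `onProgress` receives a completion fraction between 0 and 1.
function processInChunks(items, handleItem, onProgress, chunkSize = 100) {
  return new Promise((resolve) => {
    let index = 0;

    function runChunk() {
      const end = Math.min(index + chunkSize, items.length);
      for (; index < end; index++) {
        handleItem(items[index]);
      }
      onProgress(items.length === 0 ? 1 : index / items.length);

      if (index < items.length) {
        setTimeout(runChunk, 0); // let rendering and input handlers run
      } else {
        resolve();
      }
    }

    runChunk();
  });
}
```

Each slice still blocks while it runs, so `chunkSize` trades throughput against responsiveness.
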
codemode, over 8 years ago
Is it true that the original implementation of Tesseract, executed from the command line, is faster than the JavaScript-translated version?

ckluis, over 8 years ago
What license? Doesn't mention it.

mrcactu5, over 8 years ago
Tesseract is not specific to JavaScript, right? I do recall there being a version for Python.

z3t4, over 8 years ago
More instructions, like how to train it, would be nice.

niutech, over 8 years ago
How does it compare with Ocrad.js?

newtons_bodkin, over 8 years ago
How long did this take to build?

sanketbajoria, over 8 years ago
Awesome

employee8000, over 8 years ago
Is this at all affiliated with the already-existing Tesseract OCR library? It doesn't seem to be from my cursory check, so if not you need to rename your library, because you're ripping off their name.

https://github.com/tesseract-ocr/tesseract