Show HN: Neural network optical character recognition

23 points by megalodon, about 10 years ago

3 comments

bradneuberg, about 10 years ago
Are the reported success metrics on the training or testing set? The website says it's on the training set, which shouldn't be a valid metric of success, since neural networks can easily overfit to their training data (one of their downsides if you aren't careful).

Having the output layer be an 8-bit character representation is very clever though, rather than a softmax layer with each node being the relative probability of a given character. That probably lowers the number of free parameters you have to train, which probably speeds up training and can help prevent overfitting. I'm interested in knowing what the true success rate is with this approach, as it seems clever.

Btw, what's your loss function on the output layer?
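To make the parameter argument concrete, here is a rough sketch of what an 8-bit output encoding could look like next to a one-node-per-character softmax layer. This is not the project's actual code: the helper names, the 0.5 threshold, the 100-unit hidden layer, and the 256-class softmax are all assumptions picked for illustration.

```python
# Illustrative sketch only; names and sizes below are assumptions,
# not taken from the project under discussion.

def char_to_bits(ch):
    """Encode a character as 8 target values (most significant bit first)."""
    code = ord(ch)
    if code > 255:
        raise ValueError("character does not fit in 8 bits")
    return [(code >> shift) & 1 for shift in range(7, -1, -1)]


def bits_to_char(outputs, threshold=0.5):
    """Decode 8 output activations back to a character by thresholding each one."""
    code = 0
    for value in outputs:
        code = (code << 1) | (1 if value >= threshold else 0)
    return chr(code)


def output_layer_params(hidden_units, output_units):
    """Weights plus biases for a fully connected output layer."""
    return hidden_units * output_units + output_units


HIDDEN_UNITS = 100  # assumed hidden layer size

target = char_to_bits("A")
decoded = bits_to_char([0.1, 0.9, 0.2, 0.1, 0.0, 0.3, 0.4, 0.8])
params_8bit = output_layer_params(HIDDEN_UNITS, 8)
params_softmax = output_layer_params(HIDDEN_UNITS, 256)
```

With these assumed sizes the 8-bit layer has far fewer output weights than a 256-way softmax. The tradeoff is that a single wrong bit decodes to a completely different character, so the threshold decode gives no notion of a "close" guess.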
frik, about 10 years ago
A training set like Google's Recaptcha data would be useful. Maybe Project Gutenberg, Wikipedia and other open source projects should start an open Recaptcha-like service to collect such data based on scanned documents/books/etc.
singularity2001, about 10 years ago
Shameless plug: Similar thing for GPU https://github.com/pannous/caffe-ocr