It's a shame that receipts don't have machine-readable output.<p>A QR code can hold up to 2,953 bytes at the lowest error-correction level, and a little over 1,200 bytes even at the highest, which should be more than enough for most receipts.<p><a href="https://en.wikipedia.org/wiki/QR_code#Storage" rel="nofollow">https://en.wikipedia.org/wiki/QR_code#Storage</a><p>Edit: related link: <a href="https://www.quora.com/Can-and-how-cash-register-receipts-be-transformed-to-a-QR-code-and-scanned-with-a-smartphone-app-so-it-can-become-digital?share=1" rel="nofollow">https://www.quora.com/Can-and-how-cash-register-receipts-be-...</a>
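To get a feel for how much room a receipt would actually need, here is a minimal sketch that serializes a receipt to compact JSON and checks the byte count against the version 40 QR byte-mode capacities. The receipt layout and field names are made up for illustration; no real point-of-sale format is implied.

```python
import json

# Byte-mode capacity of the largest QR symbol (version 40) at the lowest (L)
# and highest (H) error-correction levels.
QR_MAX_BYTES_L = 2953
QR_MAX_BYTES_H = 1273


def encode_receipt(receipt: dict) -> bytes:
    """Serialize a receipt to compact UTF-8 JSON with no extra whitespace."""
    text = json.dumps(receipt, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def fits_in_qr(payload: bytes, limit: int = QR_MAX_BYTES_H) -> bool:
    """True if the payload fits in a single QR symbol with the given byte limit."""
    return len(payload) <= limit


# Hypothetical receipt: each item is [name, quantity, unit price].
receipt = {
    "shop": "Example Market",
    "date": "2015-11-02",
    "items": [["Milk 1L", 1, 0.99], ["Bread", 2, 1.50], ["Beer 6-pack", 1, 7.49]],
    "total": 11.48,
}

payload = encode_receipt(receipt)
print(len(payload), "bytes; fits at level H:", fits_in_qr(payload))
```

Producing the actual QR image would need a third-party library, which is left out here.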
Hey, this is pretty cool. I actually tried something similar: keeping a list of shop names and matching it against Tesseract's results.
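For anyone curious, the shop-name matching can be sketched roughly like this with fuzzy string matching from the standard library. This is not the code I actually used; the shop list, the `match_shop` helper, and the 0.6 cutoff are all assumptions for illustration.

```python
import difflib

# Hypothetical list of known shop names.
KNOWN_SHOPS = ["Walmart", "Trader Joe's", "Whole Foods Market", "Costco Wholesale"]


def match_shop(ocr_lines, known=KNOWN_SHOPS, cutoff=0.6):
    """Return the known shop that best matches any OCR line, or None.

    Every OCR line is compared against every known name, case-insensitively,
    and the highest similarity ratio at or above `cutoff` wins.
    """
    best_name, best_score = None, cutoff
    for line in ocr_lines:
        candidate = line.strip().lower()
        for name in known:
            score = difflib.SequenceMatcher(None, candidate, name.lower()).ratio()
            if score >= best_score:
                best_name, best_score = name, score
    return best_name


# OCR output often confuses letters and digits, e.g. "O" and "0".
print(match_shop(["WH0LE F00DS MARKET", "123 Main St"]))
```

The cutoff trades false matches against misses, so it would need tuning on real OCR output.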
I was trying a Hough transform to correct slight image rotations. I wasn't aware of ImageMagick's textcleaner script. That could have saved me a lot of trouble :)
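The rotation idea, as a rough pure-Python sketch: a Hough-style vote restricted to near-horizontal lines. Each candidate angle is scored by how many dark pixels land in the same rho bin, and the angle with the tallest bin is taken as the skew. The `estimate_skew` name, the angle range, and the step size are assumptions; a real pipeline would more likely use an image library's Hough implementation.

```python
import math
from collections import Counter


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Estimate the dominant text-line angle in degrees from dark-pixel coordinates.

    points: iterable of (x, y) pairs.
    For each candidate angle theta, every point votes for the bin
    round(y*cos(theta) - x*sin(theta)); points on a line tilted by theta
    all share one bin. The angle whose fullest bin has the most votes wins.
    """
    points = list(points)
    if not points:
        return 0.0
    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        bins = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        votes = max(bins.values())
        if votes > best_votes:
            best_angle, best_votes = angle, votes
    return best_angle


# Synthetic example: three "text lines" tilted by 3 degrees.
slope = math.tan(math.radians(3.0))
demo_points = [(x, c + x * slope) for c in (20, 60, 100) for x in range(200)]
print(estimate_skew(demo_points))
```

Rotating the image by the opposite angle should level the lines, though the sign convention depends on the imaging library and on whether y grows downward.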
I got roadblocked by the problem of having various kinds of receipts with absolutely no layout in common. I figured the system would need a lot of training to reach decent accuracy, and left it for another day.
For the next step, and easier name matching... why not export a CSV of your online banking and use names and totals to match? Or are these cash receipts?
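A minimal sketch of that matching, assuming the bank export has `date`, `description`, and `amount` columns (real exports differ per bank) and that the receipt total has already been read from the OCR output. The helper names and the similarity threshold are hypothetical.

```python
import csv
import difflib
import io
from decimal import Decimal


def load_transactions(csv_text):
    """Parse a bank export with the assumed columns date, description, amount."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "date": row["date"],
            "description": row["description"],
            "amount": Decimal(row["amount"]),
        }
        for row in reader
    ]


def name_similarity(description, ocr_lines):
    """Best similarity ratio between the description and any OCR line."""
    desc = description.lower()
    return max(
        (difflib.SequenceMatcher(None, desc, line.lower()).ratio() for line in ocr_lines),
        default=0.0,
    )


def match_receipt(total, ocr_lines, transactions, min_similarity=0.3):
    """Among transactions with the same absolute amount, pick the best name match."""
    best, best_score = None, min_similarity
    for txn in transactions:
        if abs(txn["amount"]) != total:
            continue
        score = name_similarity(txn["description"], ocr_lines)
        if score >= best_score:
            best, best_score = txn, score
    return best


export = """date,description,amount
2015-11-01,WHOLEFDS MKT 10234,-11.48
2015-11-01,SHELL OIL 5551,-11.48
2015-11-02,COSTCO WHSE,-54.10
"""
ocr_lines = ["WH0LE F00DS MARKET", "Milk 1L 0.99", "TOTAL 11.48"]
print(match_receipt(Decimal("11.48"), ocr_lines, load_transactions(export)))
```

Filtering on the total first keeps the fuzzy name comparison to a handful of candidates, which matters because bank descriptions are usually abbreviated.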
In the past I've considered an app that would do this. It would be like mint.com, which automatically tracks your finances via online banking, but instead of showing that you spent $100 at the supermarket, it would show that you spent $20 on beer, $50 on cash back, and $30 on food, giving better insight into your finances and where to cut back to save money.
I've been thinking about something vaguely similar for paperwork processing. It'd be nice to pull the company name by recognising the layout or logo, and to attempt reading the date off the page.<p>Does anyone know of any resources, or have an idea for a direction, to get started on this?
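For the date part at least, a regex pass over the OCR text is a plausible starting point. The sketch below covers only two numeric formats and reads ambiguous dates day-first, which is an assumption that would need adjusting per locale. The `find_date` helper is hypothetical, and logo or layout recognition is not covered.

```python
import re
from datetime import date

# Patterns are tried in order; "ymd" is ISO style, "dmy" is day-first.
_PATTERNS = [
    (re.compile(r"\b(\d{4})-(\d{1,2})-(\d{1,2})\b"), "ymd"),
    (re.compile(r"\b(\d{1,2})[./](\d{1,2})[./](\d{4})\b"), "dmy"),
]


def find_date(text):
    """Return the first valid date found in the text, or None."""
    for pattern, order in _PATTERNS:
        for match in pattern.finditer(text):
            a, b, c = (int(group) for group in match.groups())
            year, month, day = (a, b, c) if order == "ymd" else (c, b, a)
            try:
                return date(year, month, day)
            except ValueError:
                # Digits matched the shape but not a real calendar date.
                continue
    return None


print(find_date("Invoice no. 4471\nDate: 02/11/2015\nTotal 11.48"))
```

Validating through `datetime.date` filters out OCR noise that merely looks like a date, such as a month of 99.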
If you are genuinely interested in this sort of thing, I'd like to think we do a pretty good job at receipt parsing.<p><a href="http://www.neat.com/" rel="nofollow">http://www.neat.com/</a><p>Disclaimer: I work for the company.