Writing a fuzzy receipt parser in Python

133 points by andygrunwald, over 9 years ago

8 comments

bariumbitmap, over 9 years ago
It's a shame that receipts don't have machine-readable output.

QR codes can hold a little over 1,200 characters, which should be more than enough for most receipts.

https://en.wikipedia.org/wiki/QR_code#Storage

Edit: related link: https://www.quora.com/Can-and-how-cash-register-receipts-be-transformed-to-a-QR-code-and-scanned-with-a-smartphone-app-so-it-can-become-digital?share=1
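As a rough illustration of the capacity argument above, here is a minimal sketch that serializes a receipt as compact JSON and checks it against a byte budget. The 1,200-byte budget is simply the figure quoted in the comment; the receipt layout, field names, and the `encode_receipt` and `fits_in_qr` helpers are all hypothetical, and no actual QR code is generated.

```python
import json

# Budget taken from the comment above ("a little over 1,200 characters").
QR_BUDGET_BYTES = 1200


def encode_receipt(receipt: dict) -> bytes:
    """Serialize a receipt as compact JSON (no extra whitespace), UTF-8 encoded."""
    text = json.dumps(receipt, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def fits_in_qr(receipt: dict, budget: int = QR_BUDGET_BYTES) -> bool:
    """Return True if the encoded receipt fits within the byte budget."""
    return len(encode_receipt(receipt)) <= budget


# Hypothetical receipt: shop name, date, total and 20 line items.
sample = {
    "shop": "Example Market",
    "date": "2015-10-06",
    "total": "42.17",
    "items": [{"n": f"Item {i}", "p": "1.99"} for i in range(20)],
}

payload = encode_receipt(sample)
print(len(payload), fits_in_qr(sample))
```

Short keys and compact separators keep the payload small; a denser binary encoding would leave even more headroom.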
laito, over 9 years ago
Hey, this is pretty cool. I actually tried something similar (keeping a list of shop names and matching it against tesseract's results). I was trying a Hough transform for slight image rotations. I wasn't aware of ImageMagick's textcleaner script; that could have saved me a lot of trouble :) I got roadblocked by the problem of having various kinds of receipts with absolutely no layout in common. I figured the system would need a lot of training to reach decent accuracy and left it for another day.
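The idea of matching a list of shop names against OCR output could look roughly like the sketch below. It uses `difflib.get_close_matches` from the Python standard library, which is an assumption made for illustration and not necessarily what the commenter or the article's author used. The shop list, the `match_shop` helper, and the 0.6 cutoff are hypothetical.

```python
import difflib

# Hypothetical list of known shop names.
KNOWN_SHOPS = ["Aldi", "Lidl", "Rewe", "Penny", "Edeka"]


def match_shop(ocr_text, shops=KNOWN_SHOPS, cutoff=0.6):
    """Return the first known shop that fuzzily matches a word in the OCR text.

    Words are compared case-insensitively; returns None if nothing
    reaches the similarity cutoff.
    """
    lowered = {shop.lower(): shop for shop in shops}
    for line in ocr_text.splitlines():
        for word in line.lower().split():
            hits = difflib.get_close_matches(word, list(lowered), n=1, cutoff=cutoff)
            if hits:
                return lowered[hits[0]]
    return None


# "RFWE" stands in for a typical OCR misread of "REWE".
print(match_shop("RFWE Markt GmbH\nSumme 12,34"))
```

The cutoff trades false positives against missed matches; a lower value tolerates worse OCR but may match unrelated words.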
omn1, over 9 years ago
Hey, author here. I'm happy to answer any questions and welcome any kind of feedback.
pbnjay, over 9 years ago
For the next step, and easier name matching... why not export a CSV of your online banking and use names and totals to match? Or are these cash receipts?
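A minimal sketch of that suggestion, assuming a bank export with `date`, `description`, and `amount` columns (real exports differ by bank): filter transactions by the receipt total, then pick the description that most resembles the OCR'd shop name. The CSV contents and the `find_transaction` helper are hypothetical.

```python
import csv
import difflib
import io
from decimal import Decimal

# Hypothetical bank export; real column names and formats vary by bank.
BANK_CSV = """date,description,amount
2015-10-05,REWE SAGT DANKE 1234,-23.45
2015-10-06,ALDI SUED FIL 77,-9.99
"""


def find_transaction(shop_guess, total, rows):
    """Return the row whose absolute amount equals total and whose
    description is most similar to shop_guess, or None if no amount matches."""
    candidates = [r for r in rows if abs(Decimal(r["amount"])) == total]
    if not candidates:
        return None

    def score(row):
        return difflib.SequenceMatcher(
            None, shop_guess.lower(), row["description"].lower()
        ).ratio()

    return max(candidates, key=score)


rows = list(csv.DictReader(io.StringIO(BANK_CSV)))
hit = find_transaction("RFWE", Decimal("23.45"), rows)
print(hit["description"] if hit else None)  # prints "REWE SAGT DANKE 1234"
```

Using `Decimal` avoids floating-point surprises when comparing money amounts. As the comment notes, this only helps for card payments; cash receipts have no matching transaction.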
joshribakoff, over 9 years ago
I've considered an app that would do this in the past. It would be like mint.com, which automatically tracks your finances via online banking, but instead of showing you spent $100 at the supermarket, it would show that you spent $20 on beer, $50 on cash back, and $30 on food... allowing better insights into your finances & where to cut back to save money.
misnome, over 9 years ago
I've been thinking about something vaguely similar for paperwork processing. It'd be nice to pull the company name from recognising the layout/logo, and to attempt reading the date out of the page.

Anyone know any resources or have an idea for a direction to get started on this?
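For the date part of that question, one possible starting point is to run a few regular expressions over the OCR text and validate each hit with `datetime.strptime`. This is only a sketch: the three formats and the `find_date` helper are assumptions for illustration, and recognising a company from its layout or logo is a separate problem not covered here.

```python
import re
from datetime import datetime

# Assumed date formats; extend this list for other locales.
DATE_PATTERNS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "%Y-%m-%d"),
    (re.compile(r"\b\d{2}\.\d{2}\.\d{4}\b"), "%d.%m.%Y"),
    (re.compile(r"\b\d{2}/\d{2}/\d{4}\b"), "%m/%d/%Y"),
]


def find_date(text):
    """Return the first valid date found, trying the patterns in list order.

    Candidates that look like a date but are not valid calendar dates
    (for example 99.99.2015) are skipped. Returns None if nothing matches.
    """
    for pattern, fmt in DATE_PATTERNS:
        for match in pattern.finditer(text):
            try:
                return datetime.strptime(match.group(0), fmt).date()
            except ValueError:
                continue
    return None


print(find_date("Invoice\nDatum: 06.10.2015 14:32"))  # prints 2015-10-06
```

Note that `06/10/2015` is ambiguous between day-first and month-first conventions; the sketch assumes month-first for slash-separated dates.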
t_g, over 9 years ago
If you are genuinely interested in this sort of thing, I'd like to think we do a pretty good job at receipt parsing.

http://www.neat.com/

Disclaimer: I work for the company.
comrh, over 9 years ago
I think I would have more problems saving all the receipts using this workflow. Just logging them into YNAB's mobile app is great for me.