Been there, done that. Put that car at an angle and see how well that works. Or maybe add rain. Or bright sunshine. Or make it dark.<p>Spoiler alert: these kinds of image-processing Rube Goldberg machines fall apart real quick under real-world conditions.<p>> The intent of this article is to sensibilize the reader that a machine learning approach in not always the best or first solution to solve some object detection problems.<p>The danger of this article is that people end up wasting time & budget on brittle approaches.<p>(Though to be fair, the danger of the deep learning/ML route is that people can end up wasting just as much time when they don't have an adequate amount of training data.)
I successfully designed a commercial LPR system using non-ML methods that handles all those kinds of weird real-world cases that are never mentioned in papers about LPR and OCR.<p>Check out some of the hard cases here: <a href="https://waysight.com/lpr-technology/#examples" rel="nofollow">https://waysight.com/lpr-technology/#examples</a><p>I couldn't use neural networks at the time, because 1) it had to run on a single-core 200 MHz ARM at 20 fps, and 2) it took too long to "debug" the NNs (often you want to improve performance but the failure cases seem fine to you).<p>So the resulting performance of this system was and is state of the art, in that in practice it captures everything correctly, so there is no point wasting cycles on an NN solution.<p>On the other hand, starting from scratch, an NN solution can allow you to advance quickly with little domain knowledge.<p>I would like to repeat a word of caution other posters mentioned as well - it takes years to collect the training data required for successful commercial LPR, both for NN and non-NN methods (you need regression tests even if you don't need the data for training), to get plates from all seasons, weathers, and vehicle conditions.<p>Interestingly, now almost 10 years later, when I look at more modern deep NNs and analyse their initial stages with modern tools, I see a lot of similarity to my original "standard" algorithms. In essence, the code I wrote did the same thing a modern convolutional deep NN would learn to do.<p>In particular, see this recent analysis of how (some) convnets learn:<p><a href="https://www.lyrn.ai/2019/02/14/bagnet-imagenet-with-a-simple-bof-model/" rel="nofollow">https://www.lyrn.ai/2019/02/14/bagnet-imagenet-with-a-simple...</a><p>Which, as it turns out, is really similar to how '80s- and '90s-style classic OCR algorithms work!
> <i>The intent of this article is to sensibilize [sic] the reader that a machine learning approach in not always the best or first solution to solve some object detection problems</i><p>This is a good lesson to teach people! But I don't think that this page does so successfully. By far the biggest issue is that they are <i>only demonstrating on a single image</i>. You can get almost any method to work on a single image, especially when you're interactively designing the method to work on that specific image. If you want a method that works in unforeseen conditions, you need to think carefully about your assumptions...<p>* Thresholding at 0.5: will that work at night? What if you're driving into the sun and the image is overexposed?<p>* Dilating Canny edges and finding connected components: what if the car is white with black trim, and the dilation merges the license plate CC with another one? What if the car is further away, and the dilation factor merges the entire car into one big blob?<p>* Filtering the blobs by aspect ratio: what if the car is at an angle (turning, or starting up a hill), so the ratio is off? What if they have bumper stickers that are the same shape and fool the filter?<p>While you can make a reasonable license plate detector with traditional methods (see, for example, <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.455.2440&rep=rep1&type=pdf" rel="nofollow">http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.455...</a>), the advantage of using ML methods is that, if done correctly, they actually have a chance of working in most of the situations you encounter (including the assumption-violating ones above), rather than just the ones you can think of while making the system. That lesson is probably just as important as the "don't jump straight to ML" one.
These sorts of posts are just so ridiculously off base.<p>People are absolutely desperate to show that ML isn't the answer to everything, which nobody in ML claims it is. But it is the very clear answer to some things, and problems like license plate detection fall so unambiguously into that category that, quite frankly, if you think otherwise you've really disqualified yourself from the set of people who ought to be asked about these things.<p>In my team, we don't use deep learning for every problem. We don't even use machine learning for every problem. I'm a mathematician, and frankly I'm just not that immature. The fact is, ML is a great solution to a lot of problems, and deep learning is a particularly good solution to an ever-increasing category of problems.<p>If you can't tell the difference between such cases, then you should probably not conclude that everyone is doing it wrong, and instead conclude that you have more to learn.
I used OpenCV about 15 years back to read football pools tickets. It eventually worked, but my (admittedly atrocious) code was slow and clunky: a soup of mixed C and C++ whose sources I luckily lost :)
What made the process even slower on the hardware available then was the deskewing needed whenever the ticket wasn't perfectly aligned with the camera, which is almost always the case.
Not having worked with OpenCV since then, my question is whether today's newer versions and available low-cost hardware (namely *Pi-like boards) could allow a similar but more complex process in near-realtime, reading whole digits instead of bullets. That is: a user puts a piece of paper under the camera, the software recognizes what it is by matching patterns it has in common with other similar documents, reorients it when needed, then reads the text and numeric information with OCR code, and finally, according to that information, fills/updates one or more fields in this or that database.
Doable? Caveats?
I’ve tried many of the same techniques myself for doing OCR. It’s a very brittle pipeline. You’ll need ML to do the job well. The state of the art is Jaderberg’s Text Spotting in the Wild.
I worked on a project on optical inspection of PCBs on assembly lines in early 2004, and it was all image processing. Not a peep about ML in those days.
A signpost with the same aspect ratio will also make this fail, to say nothing of the failure rates from skewed angles & lighting issues.
An ML approach that takes context into account is the way to do this, and it will only improve in performance in the future.<p>This is almost sounding like ASM vs Python.
It's essentially an approach based on heuristics. It can be brittle and can't deal with ambiguity well, and that's where ML shines. It's not uncommon for people to combine heuristics with ML in the real world, since rule-based models are fast and cheap.
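A toy version of that heuristics-plus-ML cascade: a cheap rule-based gate prunes candidates before the expensive model runs. `classify` here is a hypothetical stand-in for an ML classifier, not a real API.

```python
# Heuristics-plus-ML cascade: cheap rules reject obvious non-plates so
# the expensive model only sees plausible candidates.

def plausible_plate(w, h):
    """Rule-based gate: plate-like aspect ratio and minimum size."""
    return h > 0 and 2.0 < w / h < 6.0 and w * h > 300

def classify(box):
    """Hypothetical stand-in for an expensive ML classifier."""
    return 0.9  # pretend confidence score

# Candidate boxes as (x, y, w, h).
boxes = [(40, 10, 120, 30), (5, 5, 20, 80), (0, 0, 300, 10)]
survivors = [b for b in boxes if plausible_plate(b[2], b[3])]
scores = {b: classify(b) for b in survivors}
print(survivors)  # only the ~4:1 box reaches the model
```

The gate costs a few comparisons per box, so even if the rules are imperfect, they cut the number of model invocations dramatically.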
This approach fails in practice: e.g. the morphological operations you need for an infrared nighttime image with washed-out whites are different from what you need to detect the plate in daytime, etc.
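A toy numpy illustration of that fixed-parameter brittleness (the intensity values are invented): the same global threshold that isolates a plate in a daytime row of pixels selects the entire row in a brighter, low-contrast night-IR version of the scene.

```python
import numpy as np

# One row of pixel intensities; the "plate" is the two bright values.
day = np.array([40, 40, 200, 200, 40], dtype=np.uint8)
# Infrared night scene: everything brighter and low-contrast.
night = np.array([180, 180, 220, 220, 180], dtype=np.uint8)

THRESH = 127  # tuned on daytime images
mask_day = day > THRESH      # picks out just the plate pixels
mask_night = night > THRESH  # washed out: selects the whole row
print(int(mask_day.sum()), int(mask_night.sum()))
```

The daytime mask keeps 2 of 5 pixels; the night mask keeps all 5, so every downstream stage sees garbage unless the parameters adapt to the conditions.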