It's a fine balance between Not Invented Here syndrome and trying to hammer the square peg of off-the-shelf OSS into the round hole of the actual problem you're trying to solve.

For example, they suggest ROS as robust, industry-ready software, which absolutely hasn't been my experience: you hire a bunch of domain experts to solve the various [hardware, controls, perception, imaging, systems] problems, but once you use ROS as your middleware you end up needing a bunch of ROS experts instead. This is due to the horrible build system, the odd choice of defaults, the instability under constrained resources, and how it inserts itself into everything. You end up needing more fine-grained control than ROS gives you to make an actually robust system, but by the time you discover this you'll be so invested in ROS that switching away will involve a full rewrite.

The same goes further downstream: OpenCV images are basically a void* with a bunch of helper functions. (4.x tried to help with this but got sideswiped by DNN before anything concrete could happen.)

I guess it's the same rant the FreeBSD people have about the Linux ecosystem and its reliability. However, I'd hope we raise our standards when it comes to mobile robots that have the potential to seriously hurt people by accident. And who knows, maybe one day OpenCV and ROS will pleasantly surprise me the way Linux has with its progress.
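To make the OpenCV point above concrete, here's a rough sketch of what I mean by "a void* with helper functions" (my own toy example, assuming a stock OpenCV install): cv::Mat carries its element type only as a runtime tag, so the compiler can't stop you from reading the buffer with the wrong type.

    // Sketch of the "void* with helper functions" complaint about cv::Mat:
    // the pixel buffer is an untyped uchar* plus a runtime type tag, so a
    // mismatched at<>() compiles fine and only fails (or silently misreads
    // memory) at runtime.
    #include <opencv2/core.hpp>
    #include <iostream>

    int main() {
        cv::Mat img(4, 4, CV_32FC1, cv::Scalar(0.5f)); // 32-bit float image

        // img.data is a bare uchar* regardless of the element type; all the
        // type information lives in a runtime tag (img.type()).
        std::cout << "type tag: " << img.type()
                  << ", element size: " << img.elemSize() << " bytes\n";

        float ok = img.at<float>(0, 0);   // correct: matches CV_32FC1

        // Wrong template type: the compiler can't catch this. Debug builds
        // may assert; release builds just reinterpret the float's bytes.
        uchar wrong = img.at<uchar>(0, 0);

        std::cout << "as float: " << ok << ", as uchar: " << int(wrong) << "\n";
        return 0;
    }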
This article struck a personal note with me because around the same time (2008-2012) I was really getting into vision, and even got published as an undergrad for imaging sensor fusion work (...my first, only, and likely last meaningful contribution to my species). While the wider MV/CV community was making incremental gains every few years (anyone else remember Histograms of Oriented Gradients?), that's what they were: incremental. (I also remember my research supervisor recounting how the patent on SIFT probably held back the entire field by a decade or two, so yes, things were slow-moving...)

...until a few years ago when:

> Computer vision has been consumed by AI.

...but "AI" is an unsatisfying reduction. What does it even mean? (And c'mon, plenty of non-NN CV techniques going back decades can be called "AI" today with a straight face: an adaptive pixel+contour histogram model for classifying very specific things, for example.)

My point is that computer vision, as a field, *is* (an) artificial intelligence: it has not been "consumed by AI". I don't want ephemeral fad terminology (y'know... buzzwords) getting in the way of what could have been a much better article.
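For a concrete taste of what I mean by a decades-old, non-NN technique that would get labelled "AI" today, here's a rough sketch of a hue-histogram model plus back-projection (not the exact adaptive model I mentioned, just the flavor of it; file names are placeholders, error handling omitted):

    // Classic histogram back-projection: build a hue histogram of a target
    // patch (the "model"), then score each pixel of a new frame by how well
    // it fits that distribution. No neural nets involved.
    #include <opencv2/core.hpp>
    #include <opencv2/imgproc.hpp>
    #include <opencv2/imgcodecs.hpp>

    int main() {
        cv::Mat target = cv::imread("target_patch.png"); // example patch of the object
        cv::Mat scene  = cv::imread("scene.png");        // frame to classify

        cv::Mat targetHsv, sceneHsv;
        cv::cvtColor(target, targetHsv, cv::COLOR_BGR2HSV);
        cv::cvtColor(scene,  sceneHsv,  cv::COLOR_BGR2HSV);

        // 1-D hue histogram of the target: this *is* the model.
        int histSize = 30;
        float hueRange[] = {0, 180};
        const float* ranges[] = {hueRange};
        int channels[] = {0};
        cv::Mat hist;
        cv::calcHist(&targetHsv, 1, channels, cv::Mat(), hist, 1, &histSize, ranges);
        cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);

        // Back-project onto the scene: each pixel gets a likelihood of
        // belonging to the modelled hue distribution.
        cv::Mat backproj;
        cv::calcBackProject(&sceneHsv, 1, channels, hist, backproj, ranges);

        // Threshold into a crude per-pixel classification of "target-like" regions.
        cv::Mat mask;
        cv::threshold(backproj, mask, 50, 255, cv::THRESH_BINARY);
        cv::imwrite("target_likelihood.png", mask);
        return 0;
    }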