I'm running a Computer Vision consulting company quite successfully. Basically, my job is to implement "where's Waldo" IRL for my customers (B2B).<p>However, I'm surprised at how difficult it seems to focus on ONE specific kind of customer: most of my customers come from different backgrounds and the stakeholders are quite different, which makes the sales process sometimes slow and complex, with a high CAC.
On my path toward a more streamlined offering (read: moving from a service to a more general product), I struggle to find common traits among my customers' Computer Vision needs.<p>Some customers have a programmer background, some do not.
Some customers come from marketing, others from manufacturing/operations, others run SMBs.<p>The do-all-do-nothing approach of platforms like Sagemaker or Azure Custom Vision overwhelms them. These platforms usually address people who:
- already have a data science background
- integrate into a broader and generally complex environment<p>Easier-to-use platforms like Clarifai do not seem to be very common (not to mention the lay-offs) and end up in a vendor lock-in situation where you HAVE to contract them if you want to go further with model tuning.<p>What is your take on the computer vision market? Do you think there's a need for an in-between platform (ease of use of Clarifai but with Sagemaker's flexibility and tweakability)?<p>If so, would you imagine an obvious and ever-present use case?<p>And as a side question: if you compared the current Computer Vision market value chain to another industry, what would it be?<p>Thanks :)
The hardest part of building computer vision models these days is collecting and annotating data. A self-serve tool like Sagemaker doesn't solve that problem.<p>As a computer vision company, your best bet is to focus on a specific industry or a narrow problem and build fully featured products. This could be medical imaging, satellite imagery, face recognition, fashion, adult content filtering, etc.<p>We're doing this for food. We built a consumer food logging app that's a great source of data to keep improving our recognition model. At this point we have millions of labeled images, something that would take most of our API customers years to collect. To make integration easier, we couple the recognition model with a food "knowledge graph" that includes nutrition facts, ingredients and dietary categorizations. It turns out that this is useful to a lot of healthcare, fitness, nutrition and CPG companies.
I think you're right that product-market fit isn't there yet, except for the organization that hires a real CV expert. So far, vendors have been indulging in wishful thinking about what makes a "minimum viable" platform.