
Ask HN: How many % of Machine Learning projects “fail”?

16 points by satishgupta over 4 years ago
There are all these reports/surveys claiming ~80% of ML projects fail:

- Jan 2019: https://blogs.gartner.com/andrew_white/2019/01/03/our-top-data-and-analytics-predicts-for-2019/ Gartner predicted that through 2020, 80% of AI projects will remain alchemy, run by wizards, and that through 2022, only 20% of analytic insights will deliver business outcomes.

- May 2019: https://content.alegion.com/dimensional-researchs-survey The Dimensional Research / Alegion survey reported that 78% of AI or ML projects stall at some stage before deployment, and that 81% of respondents admit the process of training AI with data is more difficult than they expected.

- July 2019: https://venturebeat.com/2019/07/19/why-do-87-of-data-science-projects-never-make-it-into-production/ VentureBeat reported that 87% of data science projects never make it into production.

There were similar claims in the 1990s about the failure of software development projects:

- 1994: https://www.standishgroup.com/sample_research_files/chaos_report_1994.pdf The Standish Group's CHAOS Report claimed that only 16% of software projects succeeded.

That claim was doubted: https://ieeexplore.ieee.org/document/1438340

Over time, the CHAOS reports became more nuanced: https://www.standishgroup.com/sample_research_files/CHAOSReport2015-Final.pdf

For building ML-based software applications and product features, my questions are:

- How is "failure" defined for ML projects? When should a project be called a failure?

- Is the failure rate really 80%?

4 comments

moksly over 4 years ago
I work in the public sector in Europe. We're regularly approached by various ML projects, from universities to IBM, and we don't turn them down, because we genuinely never turn down potential. Especially not something that's at the top of Gartner lists, or on the minds of our CEOs because they regularly hear about it from various Deloitte/E&Y types. This may sound condescending, but I don't mean it that way; the same thing happened with RPA, and we're reaping huge benefits from that despite its technical weaknesses.

We have yet to make anything work in the real world with AI or ML that wasn't some sort of image recognition. We let IBM use Watson to come up with various forms of analytics, but it didn't come up with anything our analytics team wasn't already building, and it was frankly far behind them. Even if it had matched or surpassed them, the licensing would likely have made us stick with humans, because they were simply cheaper.

Eventually it may be good and cheap enough, but one of the key issues we've often run into is that we don't have enough data of a high enough quality, and that's honestly not very likely to improve. Even with the EU pushing common enterprise architectures, and suppliers adopting/accepting that a police CaseFile always has a specific recipe to follow, they don't necessarily stick to it outside of the APIs.

It's hard to name a specific failure rate. As far as recognising images and trawling through millions upon millions of case files in TIFF format goes, it works flawlessly. But for everything else, our anecdotal failure rate is probably 100%.
superbcarrot over 4 years ago
There was a thread yesterday about how data science is done, and my answer was that it's just too broad a question to fit in a single comment. Same here: there is a lot of variability in how a data science / machine learning / AI project is defined at different companies.

Some teams do research. It could be research in machine learning itself, or research in applying ML to some other problem domain. Depending on how that team is set up, "production" might not even be a thing for them.

Some teams build software with ML features. They may do some in-house research or reuse existing models/architectures. The ML feature can be a critical part of the product, or it can be a nice-to-have.

Some teams neither do research nor build software: they run their analyses and models interactively, so to speak, and the results get saved to a database or file and/or are used to create some report or inform some decision. This process can be internal to the operations of the company, applied to some other domain, or be a piece of consulting.

I'm sure people can come up with more ways to do this type of work. My point is that in each of those cases success is defined differently. So when some online article starts throwing numbers around, I'm very skeptical about how deeply they actually looked into these parameters.
[Comment #25871297 not loaded]
michaericalribo over 4 years ago
This doesn't directly answer your question, but there's an important distinction to draw between two use cases for "ML":

1. ML is used to improve UX / ops.
2. Other people's jobs depend on the quality of the ML output.

1 can encompass things like offering recommendations or optimizing click-through. 2 is things like analytics-as-a-product in smart manufacturing.

I assume 1 is a lot easier to "succeed" at: if I receive a bad music recommendation, I just move on to the next one.

2 is harder to do well, and the output quickly loses credibility when it's bad.

That said, bad ML in high-stakes contexts (by which I mean, someone else is using the ML to do their job better) *should* get weeded out pretty quickly... so for the total effort and care taken to address 2, those projects may be more successful on average.
Jugurtha over 4 years ago
We've been through that, and we're building our platform to address some parts of this problem so projects have a higher likelihood of success.

In our experience building machine learning products and executing projects for large organizations over many years, the problems show up in different parts of the journey: there were problems with us, problems within the client, and problems in the interactions between us.

The problems with us were related to the way we did machine learning projects. We did turn-key solutions, from data acquisition, to training models, to building software for the client's domain experts not only to use these models but to train new ones with fresh data after we were gone.

Some of our people were pure academics who taught at universities but relied on colleagues to set up their environments, upgrade their systems, deploy their models, get them the data, etc. Others were good at developing software but had to be dragged into the ML realm just to use models. We had different profiles; some could do it all: from signal processing, to writing software for embedded data-acquisition devices, to setting up environments and infrastructure, to writing web and mobile applications, to deploying models and applications.

There was a lot of friction, given that these profiles lived in different realms and optimized for different things, and there *had* to be overlap to get things done: we were a tiny team that could not afford to hire specialists full time, since we were a consultancy and our revenue was based on projects.

Even when we had many projects running in parallel, it was with the same people, and the same people working on six or seven projects with different code bases can drive anyone nuts. The context switches; the email you send asking for data gets answered when you are working on something else; scheduling meetings; jumping from one project to the other. The PhD who wants to show results to the client taps on your shoulder to make that happen while you're busy deploying their earlier work, which you can't even locate because experiment tracking, if it was done at all, was ad hoc: Excel sheets, logs, pen and paper, in people's heads, over email.

The problems in client organizations appeared especially when not all stakeholders were aligned. You are on a project where only executives are involved, and you ask for the involvement of their domain experts but get denied. So you build things based on what you *think* they would want, or you don't get data in a timely manner because you have to set up meetings with legal, security, the data warehouse team, sales, and marketing, who get onboarded at a later stage, or a new privacy law gets passed and changes everything.

People in the client organization had different agendas, or didn't commit to the project because they resented not having been consulted, or were afraid of this "AI thing" replacing them and started pulling weird moves, like not answering your emails or not giving you data. Sometimes the right people, the ones with deep domain expertise, are not the ones you are talking with. Sometimes, when you do have the people with domain expertise, you don't have the support of the decision makers.

We have changed the way we do business: we insist right from the start that they be at the table or we won't do it, and we've had much better results that way. There's an effort to explain things that is extremely important: what machine learning is, what can and cannot be done, *defining the problem and the success criteria*, *keeping that definition alive through the project and not letting other things creep in*, and agreeing on the deliverables. We've been having better results like that.

The problem in the interactions is sometimes the latency of communication. Some clients, even large ones, are super responsive and you have the *full support of everyone*. They know what they want, they know how badly they want it, and they'll put everything at your disposal to get things done. Top management is involved. People are helpful. You get to talk with everyone, and they have people who can work their magic to get you what you need and to onboard their own people. Regular updates, a great cadence; you really see progress. We were successful in these contexts and the clients were happy. Repeat business follows: they come back with more problems and you leverage the relationships built during the previous projects. Everybody knows everybody; you are basically like colleagues. They have the stamina to *see things through*.

So, for the interactions and the problems we faced with clients, we have fixed things upstream by explaining how things will happen, insisting that the problem be specified, getting the *metrics* right, etc.

For the problems on our side, we're building our platform[0] to dramatically accelerate our execution: collaborative notebooks, scheduled long-running notebooks, automatic experiment tracking for metrics, params, and models, one-click deployments, and a live dashboard for monitoring.

All of these used to require a tap on the shoulder, or were blocking steps. A lot of time was wasted on broken environments, where a "data scientist" needed a colleague's help to fix their setup or to deploy their model. Automatic tracking removes the burden of remembering to track experiments; we were tired of trying to remember which model was best, which data it trained on, which parameters were used, and which metrics it achieved. Or a developer wanted to build a feature relying on a model from another project but had to be dragged into the "machine learning realm" just to use that model.

The short-term objective for the platform was to take over what we were doing manually, so we don't have to do it ourselves. This alone has dramatically increased our quality of life and reduced our stress level. We've also witnessed this effect in one of our current projects: work that would have involved our people could already be done by the platform, so it prevented taps on their shoulders with "new priority, folks...". That makes me happier.

The mid-term objective is to improve our readiness, step up our ML game, and build best practices right into the platform without becoming rigid, since one of the things we didn't like in other solutions is their rigidity: the rigidity of what a "pipeline" is, the rigidity of drag-and-drop environments, etc.

We're optimizing for getting things done, not for the most visually appealing product, as stylesheets never held our machine learning projects back.

- [0]: https://iko.ai
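To make the "automatic experiment tracking" point above concrete, here is a minimal sketch of the kind of tracking being described. It uses MLflow purely as a stand-in; the commenter's own platform (iko.ai) is not shown here, and the dataset, model, and run name below are illustrative assumptions, not anything from the thread.

    # Rough sketch of automatic experiment tracking, using MLflow as a
    # stand-in for the kind of platform described above (iko.ai itself is
    # not shown; the dataset and model are illustrative assumptions).
    import mlflow
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    mlflow.autolog()  # parameters, metrics, and the fitted model get recorded

    with mlflow.start_run(run_name="rf-baseline"):
        model = RandomForestClassifier(n_estimators=100, max_depth=5)
        model.fit(X_train, y_train)
        acc = accuracy_score(y_test, model.predict(X_test))
        mlflow.log_metric("test_accuracy", acc)  # explicit logging also works

    # Each run keeps which parameters were used, which metrics came out, and
    # the model artifact, instead of Excel sheets, logs, or email threads.

The point is not the specific library; it is that tracking happens as a side effect of running the code, so nobody has to remember to write anything down.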