Disclaimer: I am the author.
This is a pun on Dijkstra's "Go To Statement Considered Harmful".<p>In 2023, the term “machine learning pipeline” is overused as shorthand for "productionizing ML models". However, there is widespread confusion about what an ML pipeline actually is, and in particular what its inputs and outputs are. A pipeline is a computer program with clearly defined inputs and outputs that either runs on a schedule or runs continuously. The problem is that when you tell somebody to “build an ML pipeline”, you convey no information beyond the fact that they need to automate the execution of some program.
The goal of this article is to say: be precise about what you are talking about.<p>Are you creating features? - it is a feature pipeline
Are you training a model? - it is a training pipeline
Are you making predictions with a model? - it is an inference pipeline
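<p>To make the distinction concrete, here is a minimal Python sketch of the three pipelines as plain functions with explicit inputs and outputs. Everything in it is an assumption for illustration: the function names, the `Model` type, the toy "midpoint threshold" model, and the sample data are hypothetical and not taken from the article. The only point is that each pipeline's signature says what goes in and what comes out.

```python
from dataclasses import dataclass
from statistics import mean


# Hypothetical toy model: a single decision threshold.
@dataclass
class Model:
    threshold: float


def feature_pipeline(raw_rows: list[dict]) -> list[dict]:
    """Input: raw data. Output: features (plus labels when present)."""
    features = []
    for row in raw_rows:
        feat = {"amount_per_item": row["amount"] / row["items"]}
        if "label" in row:
            feat["label"] = row["label"]
        features.append(feat)
    return features


def training_pipeline(features: list[dict]) -> Model:
    """Input: features and labels. Output: a trained model."""
    positives = [f["amount_per_item"] for f in features if f["label"] == 1]
    negatives = [f["amount_per_item"] for f in features if f["label"] == 0]
    # Toy "training": put the threshold midway between the class means.
    return Model(threshold=(mean(positives) + mean(negatives)) / 2)


def inference_pipeline(model: Model, features: list[dict]) -> list[int]:
    """Input: a trained model and new features. Output: predictions."""
    return [int(f["amount_per_item"] > model.threshold) for f in features]


# Made-up labeled history and unlabeled new data.
history = [
    {"amount": 100.0, "items": 2, "label": 1},
    {"amount": 90.0, "items": 3, "label": 1},
    {"amount": 10.0, "items": 2, "label": 0},
    {"amount": 12.0, "items": 4, "label": 0},
]
new_rows = [{"amount": 80.0, "items": 2}, {"amount": 6.0, "items": 3}]

model = training_pipeline(feature_pipeline(history))
predictions = inference_pipeline(model, feature_pipeline(new_rows))
print(predictions)  # [1, 0]
```

In a real system each function would likely be a separately scheduled or continuously running program, reading from and writing to shared storage rather than passing Python objects around. Naming them separately is what tells the reader which of the three is meant.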