The answer depends on what you mean by first principles. Usage of the phrase "first principles" has sprawled into many different things since (I think) Musk first mentioned it as a way to learn. The original, philosophical meaning of a first principle was a fundamental truth from which others could be derived. Much of the philosophising of thinkers like Aristotle or Descartes was an attempt to uncover these truths (e.g. "I think, therefore I am"). In physics and other sciences, it means calculating from established laws rather than from approximations or assumptions. Then it got borrowed into certain circles of the tech crowd with the vague meaning of thinking about what's important or true and ignoring the rest. Then it trickled down into the learning/self-help world as some sort of hack for learning. If we take the original meaning of first principles, there aren't many absolute truths in machine learning. It is a very empirical, approximate, engineering-oriented endeavor. Most of the research involves thinking of a new approach, building it and trying it on new datasets.

The other big question is why you want to learn it. If you want to learn ML for its own sake, then anything, including the search algorithms you mentioned (which were considered core to ML a long time ago), is part of that. But if you want to learn ML to contribute to modern developments like LLMs, then search algorithms are virtually useless. And if you aren't going to be engineering any ML or ML products, what you really want is insight into its future and the business around it, in which case learning something like the transformer architecture will be far less useful than, say, reading about the economics of compute clusters.

Given the empirical/engineering character of current ML, I'd say building it from scratch is a really good way to pick up the handful of first principles there are (the fundamental functions involved, data cleaning, training loops, etc.).
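To make "from scratch" concrete, here's roughly the kind of thing I mean: a toy sketch, assuming only numpy, of a one-hidden-layer network trained with hand-derived gradients on a made-up regression task. The names and the task are just illustrative, not from any particular library or course, but it forces you through the forward pass, the loss, the chain rule and the update step, which is most of the "first principles" on offer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = sin(x) on [-pi, pi]
X = rng.uniform(-np.pi, np.pi, size=(256, 1))
y = np.sin(X)

# Parameters of a tiny MLP: 1 -> 16 -> 1
W1 = rng.normal(0, 0.5, size=(1, 16))
b1 = np.zeros((1, 16))
W2 = rng.normal(0, 0.5, size=(16, 1))
b2 = np.zeros((1, 1))

lr = 0.05
for step in range(2000):
    # Forward pass: affine -> tanh -> affine
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2

    # Mean squared error loss
    err = pred - y
    loss = np.mean(err ** 2)

    # Backward pass: chain rule, derived by hand
    n = X.shape[0]
    d_pred = 2 * err / n                 # dL/dpred
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0, keepdims=True)
    d_h = d_pred @ W2.T
    d_h_pre = d_h * (1 - h ** 2)         # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h_pre
    db1 = d_h_pre.sum(axis=0, keepdims=True)

    # Plain gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if step % 500 == 0:
        print(f"step {step}: loss {loss:.4f}")
```

Once you've written something like that yourself, the higher-level frameworks stop looking like magic, which is about as close to "first principles" as the field gets.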