I see a lot of negativity about a product whose details aren't even clear yet, which is a shame. There are a lot of problems with AI development today, and if Modular fixes even a slice of those issues, then that's pretty great. I have some skepticism too, but in the spirit of optimism, here's a list of things that make AI development difficult. Maybe they will fix some:<p>1. The iteration cycles are really slow. Training takes O(days) for the most common models that can still do something useful.<p>2. It's very hard to predict whether a change will make your model better or worse. You typically have to run each idea through training end to end to find out. Even a 10% idea success rate is pretty good, but it takes a lot of time and GPUs to find out which ideas do work.<p>3. Programming a training pipeline is really error-prone. The most common error is that your tensors are different sizes, and even the bolt-on static type systems don't help you prevent tensor size mismatches.<p>4. It's true that lock-in is pretty bad. If NVIDIA had real competition, then maybe prices would come crashing down.<p>5. If the set of primitives in PyTorch or TensorFlow works for you, then great! But if you need new primitives, then you're crossing the Python-C++ boundary, learning CUDA, and also grappling with build systems (either praying that Python will find CUDA, or learning more than you ever wanted to know about Bazel).<p>6. All the data prep and final analysis tends to need specialized scripts and pipelines. These tools are written by people who just want to get back to the model, and they are run infrequently. You're lucky if the scripts still work today, and especially lucky if there are any tests at all.
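<p>To make point 3 concrete, here's a minimal sketch in plain Python (no PyTorch involved; `Shape`, `matmul_shape`, and `forward_shapes` are names I made up for illustration). It assumes a toy model that is just a stack of 2-D matrix multiplies. The annotations only say "tuple of ints", so a static checker has nothing to object to, and the mismatch only shows up when the code actually runs:

```python
from typing import List, Tuple

# Hypothetical alias: a tensor shape is just a tuple of ints to the type checker.
Shape = Tuple[int, ...]


def matmul_shape(a: Shape, b: Shape) -> Shape:
    """Return the result shape of a 2-D matrix product, or raise on mismatch."""
    if len(a) != 2 or len(b) != 2:
        raise ValueError(f"expected 2-D shapes, got {a} and {b}")
    if a[1] != b[0]:
        raise ValueError(f"inner dimensions differ: {a} @ {b}")
    return (a[0], b[1])


def forward_shapes(batch: Shape, layer_shapes: List[Shape]) -> Shape:
    """Propagate a batch shape through a stack of weight shapes.

    Raises ValueError naming the first layer whose shape does not line up.
    """
    shape = batch
    for i, weight in enumerate(layer_shapes):
        try:
            shape = matmul_shape(shape, weight)
        except ValueError as err:
            raise ValueError(f"layer {i}: {err}") from err
    return shape
```

<p>With this sketch, `forward_shapes((32, 784), [(784, 256), (256, 10)])` returns `(32, 10)`, while changing the second layer to `(128, 10)` raises a `ValueError` that points at layer 1. Both calls look identical to a type checker. A dry run over shapes like this is cheap, but it's a workaround you have to remember to write, not something the language gives you.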