Very cool. I played around with something similar a few months ago (without the sandbox). The tricky part is finding a diff format that the AI can use intuitively to patch code in place; otherwise you blow out the token limits quite fast (there's a rough sketch of the kind of format I mean at the end of this comment).

I wonder to what extent AI is going to kill the comprehensibility of infrastructure. Over the weekend I did a bit of hacking with a friend. He wants to move off AWS to Hetzner to save money, but wasn't experienced with Linux sysadmin, so I was teaching him some stuff and simultaneously pitching a little prototype I'd made of a server config tool. It's sort of like a higher-level Ansible: you specify what packages and containers you want and the URLs they should be served on, and it goes ahead and configures unattended upgrades, Docker and Traefik to deliver that outcome (an example of the sort of input I had in mind is at the end of this comment). The idea was to extend it to configuring backups with BorgBackup and other common sysadmin-y tasks.

He was enthused! The next day, though, things changed. He'd signed up for GPT-4 and it'd just blasted out a pile of Ansible configs for him. He doesn't know Ansible, but it didn't matter: he just iterated with the AI a few times and now the problem is solved.

This makes me wonder if there's much point anymore in improving the usability of systems software (programming languages, databases, operating systems, clouds, anything driven by textual configuration). A basic assumption underlying making better tools is that the user's time is valuable, and that intuitive and simple systems therefore have value. But the AI's time isn't valuable. It can read some poorly written docs, then the source code of the system, then people getting frustrated on StackOverflow, synthesize all of that and spit out whatever pile of inscrutable config files is needed, and it can do so within seconds.

If we extrapolate this experience forward, then maybe within a couple of decades many organizations will be running on infrastructure that nobody understands at all. You get cases today where individual programs or workflows are only understood by one guy, but those are extremes, and it's understood to be a bad situation. Maybe in future it'll be normal, and to do anything companies will have to ask an AI to do it for them. We already see skills atrophying around things like UNIX sysadmin because nowadays everyone learns AWS instead (which is one reason they can charge so much money for it), so I think that as people retire, knowledge of how to run Linux systems will steadily disappear. But conceivably people will stop learning the cloud too, and then to do anything with servers it'll be AI or the highway.

You can also apply this to programming languages. Why engage in a high-effort project like inventing a Java or Rust if an AI can spit out correct C++ for you, given enough tries at writing tests? Does the motivation to produce anything beyond incremental improvements to existing tools disappear now?

I keep flipping back and forth on the answer. On one hand, it seems pretty pointless to develop this prototype further; maybe Ansible or AI-written shell scripts are the end of the road for making Linux easier to use. On the other hand, AI is remarkably human-like: it also makes mistakes, and it also benefits from well-written docs, simple interfaces and good error messages. So maybe usability still has a purpose, though perhaps we now need to master a new field of "usability studies for AI".
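To make the diff format point concrete, here's roughly the kind of search/replace patch I mean, with a toy applier in Python. The format and names are purely an illustrative sketch, not what any particular tool uses; the point is that the model only has to reproduce the lines it wants to change plus a little context, not the whole file.

    # A patch is a list of (search, replace) hunks. The model emits only the
    # lines it changes plus a little surrounding context, so responses stay
    # small even for large files. (Hypothetical format, for illustration.)
    PATCH = [
        (
            "def greet(name):\n    print('hello ' + name)\n",
            "def greet(name: str) -> None:\n    print(f'hello {name}')\n",
        ),
    ]

    def apply_patch(source: str, patch) -> str:
        # Apply each hunk exactly once; if the search text isn't found, fail
        # loudly, since that usually means the model hallucinated context.
        for search, replace in patch:
            if search not in source:
                raise ValueError("hunk not found:\n" + search)
            source = source.replace(search, replace, 1)
        return source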
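And for context on the config tool: the input I had in mind was something in this direction (a made-up example in Python, not the prototype's actual syntax). You declare what should be running and where it should be reachable, and the tool derives the Docker, Traefik and unattended-upgrades setup from that.

    # Hypothetical declarative input: packages to install, containers to run,
    # and the URLs they should be served on. Everything else (Traefik routes,
    # Docker setup, unattended upgrades) is derived from this.
    SERVER = {
        "packages": ["borgbackup", "fail2ban"],
        "containers": {
            "blog": {"image": "ghost:5", "url": "https://blog.example.com"},
            "wiki": {"image": "requarks/wiki:2", "url": "https://wiki.example.com"},
        },
        "unattended_upgrades": True,
    }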