I think "AI software engineer" is the wrong target for the current generation of models, though it's good for generating buzz.<p>I'm working on an open source, terminal-based tool that uses agents to build complex software (<a href="https://github.com/plandex-ai/plandex">https://github.com/plandex-ai/plandex</a>), and I have found that the results are generally much better when you target 80-90% of a task and then finish up the rest manually, as opposed to burning lots of time and tokens on trying to get the LLM to do it all end-to-end.