This feels like an excellent demonstration of the limitations of zero-shot LLMs, and like the wrong way to approach the problem.

I'm no expert in the matter, but for "holistic" things (where there are a lot of cross-connections and inter-dependencies) it feels like a diffusion-based generative structure would be better suited than next-token prediction (toy sketch of the distinction at the end of this comment). I've felt this way about poetry generation, and I suspect it applies in these sorts of cases as well.

Additionally, this is a highly specialized field. From the conclusion of the article:

> Overall we have some promising directions. Using LLMs for circuit board design looks a lot like using them for other complex tasks. They work well for pulling concrete data out of human-shaped data sources, they can do slightly more difficult tasks if they can solve that task by writing code, but eventually their capabilities break down in domains too far out of the training distribution.

> We only tested the frontier models in this work, but I predict similar results from the open-source Llama or Mistral models. Some fine tuning on netlist creation would likely make the generation capabilities more useful.

I agree with the authors here.

While it's nice to imagine that AGI would be able to generalize skills to work competently in domain-specific tasks, I think this shows very clearly that we're not there yet, and if one wants to use LLMs in such an area, one would need to fine-tune for it. I'd like to see a round 2 of this using a fine-tuning approach.
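
To make the diffusion point concrete, here's a toy Python sketch of the two decoding loops. There's no real model here: `predict` is a random stand-in, and `VOCAB` and the refinement schedule are made up. The point is only the control flow: autoregressive decoding commits left to right, while a diffusion-style loop repeatedly fills masked positions conditioned on the whole sequence.

```python
import random

VOCAB = ["VCC", "GND", "R1", "C1", "U1", "NET_A", "NET_B"]
MASK = "<mask>"

def predict(seq, position=None):
    # Stand-in for a trained model: a real one would score the whole
    # vocabulary given `seq` (and `position` for the diffusion case).
    return random.choice(VOCAB)

def autoregressive_generate(length):
    # Next-token prediction: commit tokens left to right. Token i only
    # ever sees tokens 0..i-1, so a global constraint discovered late
    # can't revise an early choice.
    seq = []
    for _ in range(length):
        seq.append(predict(seq))
    return seq

def diffusion_generate(length, steps=4):
    # Discrete-diffusion-style decoding: start fully masked and fill in
    # positions over several passes. Every prediction conditions on the
    # ENTIRE current sequence, masks included, so cross-cutting
    # constraints can influence every token at every step.
    seq = [MASK] * length
    masked = set(range(length))
    for step in range(steps):
        if not masked:
            break
        # Fill a fraction of the remaining masks each step (a real
        # sampler would pick the most confident positions first).
        k = max(1, len(masked) // (steps - step))
        for i in random.sample(sorted(masked), k):
            seq[i] = predict(seq, position=i)
            masked.discard(i)
    return seq

print(autoregressive_generate(8))
print(diffusion_generate(8))
```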
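
And on the fine-tuning point: a round 2 could plausibly start with LoRA on one of the open models the authors mention. Below is a minimal sketch using Hugging Face transformers + peft, assuming a causal LM and supervised "spec → netlist" text pairs; the model choice, hyperparameters, and the one-example dataset are all placeholders, not anything from the article.

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"  # any open causal LM would do
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # needed for padding
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model with low-rank adapters so only a small
# fraction of the parameters are trained.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# Hypothetical training pair: natural-language spec -> netlist.
examples = Dataset.from_dict({"text": [
    "### Spec: 5V LDO with input/output caps\n"
    "### Netlist:\n"
    "U1 VIN GND VOUT AMS1117-5.0\nC1 VIN GND 10u\nC2 VOUT GND 22u\n",
]})
tokenized = examples.map(
    lambda e: tokenizer(e["text"], truncation=True, max_length=512),
    remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="netlist-lora",
                           num_train_epochs=3,
                           per_device_train_batch_size=1),
    train_dataset=tokenized,
    # mlm=False gives plain next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

Whether a few thousand such pairs is enough to make generation genuinely useful is exactly the open question a round 2 could answer.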