The project sounds quite interesting but I'm not sure running it is going to work!
The code `gpt_model = "gpt4_20231230_1106preview"` is not using a valid model name as far as I can tell, so it seems unlikely to work - from <a href="https://github.com/SakanaAI/DiscoPOP/blob/main/scripts/launch_evo.py#L15">https://github.com/SakanaAI/DiscoPOP/blob/main/scripts/launc...</a>
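If the intent was OpenAI's dated GPT-4 Turbo preview, here's a minimal sketch of what I'd expect instead, assuming the standard OpenAI Python client and the public model id `gpt-4-1106-preview` (the repo's string might instead be some private deployment name I can't verify):

```python
# Hypothetical correction, assuming the public OpenAI model id was intended.
# "gpt4_20231230_1106preview" is not an id the OpenAI API recognises;
# the dated GPT-4 Turbo preview is published as "gpt-4-1106-preview".
from openai import OpenAI

gpt_model = "gpt-4-1106-preview"  # assumption: the model the authors meant

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Quick sanity check: list the models visible to this API key and confirm
# the chosen id is actually available before launching the full script.
available = {m.id for m in client.models.list()}
print(gpt_model, "available:", gpt_model in available)
```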
Unusually, the Issues section doesn't exist, so I can't provide feedback to them that way. But luchris429's repo does have one, so I'll do so there.
Maybe it's dead code. Still, it's wrong.
They are very useful when ideating with a human. On their own, they could veer off into uncertain territory and likely make mistakes that would be obvious to humans.
I'm sure LLMs can optimize the training of other LLMs (either by inventing new methods or fine-tuning existing ones). But we can't predict whether this will result in a giant leap for the field or just small increments. That's the definition of the singularity, isn't it?
A better question is "Can LLMs invent <i>anything</i>?"<p>Don't misunderstand: building system models from existing system responses as a way of analyzing those systems is a useful methodology, and it makes some otherwise tedious things not so tedious. Much like "high level" languages removed the tedium of writing assembly code. But for the same reason that a compiler won't emit a new, more powerful CPU instruction in its code generator, LLMs don't generate previously unseen system responses.
Of course they can invent anything. A better question is: how efficiently? Because even with brute force you can invent anything: <a href="https://libraryofbabel.info/" rel="nofollow">https://libraryofbabel.info/</a>